Neat little python string trick

Discussion in 'Scripting & Programming' started by ffreeloader, May 26, 2008.

  1. mgoldwasser

    mgoldwasser New Member

    6
    2
    2
    Hi ffreeloader,

    It depends upon what you want the contents of the tuple to be. What you've done here is to create a tuple that has a single element, namely the original string. I'm not sure that I see much advantage in having such a tuple.

    A related goal might be to create a tuple of individual characters from a string. You can do that using the syntax such as charTuple = tuple(alphabet). To convert a tuple of characters back into a string, you can use the join method of the string class, invoked as origString = "".join(charTuple)
     
  2. mgoldwasser

    mgoldwasser New Member

    6
    2
    2
    :biggrin
     
  3. ffreeloader

    ffreeloader Terabyte Poster

    3,661
    106
    167
    That works great. Thanks. Much better than my first idea. Funny how a few years of doing something gives much better results than 15 minutes of experimentation when first learning things.... :biggrin

    But still my question remains what the security implications in my coding with respect to this? I ask because in the discussions I read(these guys were using for loops and splits() to do conversions which seemed very convoluted to me that's why I started playing with the conversion the way I did it) they were really stressing the security implications of what could come from an untrusted source. When is the better type of data to validate if you're doing a conversion from string to tuple or vice versa?
     
    Certifications: MCSE, MCDBA, CCNA, A+
    WIP: LPIC 1
  4. mgoldwasser

    mgoldwasser New Member

    6
    2
    2
    My guess is that you were reading discussion on more general security issues. User-supplied data should indeed not be trusted if being used for example when using a script to make an operating system call or database query. Another security concern, especially in improperly written C/C++ code, is a buffer overflow attack in which arbitrarily long input from the user is stored in a fixed-length character array. Yet this does not seem related to the internal conversions between strings and tuples within Python. Perhaps there was more context in the discussion you read.
     
  5. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    Believe it or not this is not the best way to sort these items in your linear time. I can propose a solution in O(n/2) just off the top of my head. Let me show you by example:

    You have a string: 'abcd'

    Algorithm: Per iteration take the outmost elements of the character array and swap, working your way inward.

    Demo: (swapped elements in bold)

    Step 1 - 'dbca'
    Step 2 - 'dcba'

    Final reversed string: 'dcba'

    So we have four elements reversed in two steps! (O(n/2))

    Therefore:

    Worst case: O(n) which is linear time (swapping the elements one-by-one).
    Best case: O(n/2) which is swapping the element one pair at a time per iteration.

    This is the very point that I am getting at. People assume Python (or whatever) provide the best solutions when there is always someone out there who can go one better. :biggrin


    Good point, but not the whole story. Yes, Python may have been written in C but that does not automatically mean that it is even likely to be more efficient. I'm not sure how experienced you are in the language, but various features (memory allocation, overuse of the STL, etc.) can actually leave you with less than efficient code. Also the code generated by the Python interpreter will never match compiled code by its very nature. This can't be helped, but we are still in the early days of Python, so I'd be interested to see how in compares with say, Javascript for instance.


    Thanks for that. When I said 'hidden' I meant that the implementations are hidden away from the average user since they don't care about the details of implementation. :)
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.
  6. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    On the contrary, Freddie! It's just that 'modern programming' cares little for the best way to perform a task at the lowest level - one of the downsides of today's super-fast computers. I'm in my mid-thirties, so narrowly caught how things used to be before object orientation took hold and programmers were forced to code using proper paradigms, or else suffer the consequences of bloated code with loads of bottlenecks. :)
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.
  7. dmarsh

    dmarsh Petabyte Poster

    4,086
    413
    219
    I'm with harry and mgoldwasser, reversing the string should be (and by look of it is) a linear time operation. Think STL iterators or indeed the aforementioned loops. Its a bounded loop with direction, nothing more complex, no need to consider sorting whatsoever.

    The datastructure and implementation in theory could well affect performance but I think its reasonable to assume that immutable strings in general would be backed by an array, and also that the library writers would use the most sensible efficient implementation.

    I think mgoldwasser's point regarding C implementation is that language/library writers often have some advantages when writing code and will nearly always generate more efficient implementations of standard algorithms, its not always good to reinvent the wheel, but it is good to know how to if needed...
     
  8. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    But I just proved a better than linear time operation! :blink

    I give up, guys. :)
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.
  9. mgoldwasser

    mgoldwasser New Member

    6
    2
    2
    I'm a bit at a loss to know how to respond...

    • We have been discussing reversing not sorting
    • The operation in question is not swapping characters in place, rather generating a new string that contains the reversed sequence of characters.
    • In reality, all you've done is suggest a way to accomplish something using n/2 "steps" with 2 or 3 operations per step, rather than n steps with 1 operation per.
    • Please do yourself and us a favor and review the definition of O( ) notation before claiming "improvements" from O(n) to O(n/2)

    Overall, there is validity to the point that programmers should understand the tools that they use, or else risk the inefficiencies or errors that come with misusing them. As such, I tend to concur that in your case it might be best not to use them.
     
  10. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    Sorting applies to reversing elements! :blink

    For the method at hand, whether they are being swapped in-place or being transferred to a new string is irrelevant.

    Ummm, if I got it wrong argue with Knuth, not me! :biggrin It's two operations per iteration!

    Dammit you guys make me work hard! :biggrin
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.
  11. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    No mate. I know I said that I would give up, but I have to make a point:

    1. The problem would be linear if and only if the letters were first in a completely random order and the requirement were to sort them in reverse alphabetical order.

    2. Given that the string was already in alphabetical order, part of the work was already done, so half the effort was required to reverse the order, hence the n/2 complexity that the author disagrees with. :dry
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.
  12. dmarsh

    dmarsh Petabyte Poster

    4,086
    413
    219
    No you are wrong, the definition of the problem was not to sort, it was to REVERSE. The alphabet string was just used as an example candidate string to demonstrate the reverse. There is no need to sort the string, not because its already 'sorted' but because that is not part of the requirements, the string could of been 'gobledegook' and the reversal would be 'koogedelbog', no sort is needed.

    You are diverting the thread to disscuss O notation of popular sorting algorithms when the algorithm is a reverse algorithm. The linear time complexity is less than the logarithmic complexity of a sorting algorithm.
    A simple observation for reverse might be N chars will require N reads, N writes and N termination comparisions so N*3 operations. Worst case for bubble sort would be something like N pow 2 comparisons and swaps (read*2&comp&write*2) so (N pow2)*5. Average case will of course be less and prob some logarithmic function. http://www.nist.gov/dads/HTML/bubblesort.html - Has bubble sort as O(npow2), quicksort is N log N.
    http://warpelican.com/?page_id=3
     
  13. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    Dude, I don't know where you get off. I have proved you wrong on many occasions before and have done so again.

    When you rearrange the order of elements according to a set of criteria, and the position of those elements are considered sorted when that criteria is met, then they are sorted. When you look on the web or in a book, those are examples of the fundamental sorting algorithms. these are not the be all and end all of the possibilities of sorting! Jeez...


    I wasn't using any of the sorting algorithms described above. Go back an read what I did! The facts are there! Stop digging a hole for yourself, for heavens sake! :x
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.
  14. tripwire45
    Honorary Member

    tripwire45 Zettabyte Poster

    13,493
    180
    287
    Ok you two...step back for a sec. From my experience, you are both very intelligent guys who generally know what you're talking about. Take a moment and try to understand where the other guy is coming from. Is it that he's wrong or is there some sort of misunderstanding operating here that's causing you to be at loggerheads. If you are in a disagreement, so be it, but there's absolutely no reason to personalize it. Please keep that in mind. A friendly message from "the management". :wink:
     
    Certifications: A+ and Network+
  15. dmarsh

    dmarsh Petabyte Poster

    4,086
    413
    219
    I was just trying to clarify the thread topic, thats what a forums for is it not ?

    Geeeezz take a chill pill man, when I am wrong or mistaken I am more than willing to admit I am wrong, thats what heathly debate is all about no ? :D "Change your mind, prove you have one..."

    Never said they were did I if you bother to read the posts. The point is the fact that the example string happened to be sorted in this case is irrelevant to the reverse algorithm under disscussion.

    This is a type of sort no ? O notation is generally concerned with worst case,

    Step 1 - bdca
    Step 2 - adcb
    Step 3 - acdb
    Step 4 - abdc
    Step 5 - abcd

    As you can see scanning from the ends with one sweep is not sufficient for truely random data, your algorithm would require multiple passes to succeed and could result in swapping elements that were in the correct location.

    A general algorithm will need at best N log N swaps in worst case, best case (sorted) zero swaps, note number comparisons can be a lot larger than the number of swaps (best case log2(n!) ?), therefore its no where near n/2 for worst case and your algortihm is fundamentally broken. You are assuming the data is part sorted or partitioned.

    http://en.wikipedia.org/wiki/Sorting_algorithm

    No usable sorting algorithm comes close to you level of performance of O(n/2).

    N log N is the general lower bound for most sorting algorithms, for 4 elements roughly 2.4 which is more than 2 ie N/2.

    You could try some other algorithms to get closer to your measure, however in reality these performance measurements are skewed to large datasets because they discount the cost of the 'minor' operations. For small datasets like in our examples these operations are not minor and the timings for many popular algorithms will be very similar.
     
  16. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    Point taken, Trip.

    One of my primary responsibilities at work is optimisation/reverse engineering, both in terms of algorithms and resource allocation. They respect my work and knowledge a great deal - there is no reason why it should be any different here. I'm just a little tired of knowing exactly what I'm talking about and people coming along trying to teach an old dog new tricks (although I'm not an 'old dog') which are incorrect in the first place.

    Seriously, if the general public on certforums want to thrive on incorrect/incomplete information on computing science then so bid. It's disgraceful that I should be challenged for something that I've especially clearly proven.

    Rant over. :slidedrin
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.
  17. dmarsh

    dmarsh Petabyte Poster

    4,086
    413
    219
    Where have you proven that N/2 is lower limit O notation for a sort ? What exactly have you proven ?

    Your opinion or mine alone do not count as fact or even a consensus. Nobody deserves respect they have to earn it, and it has to be earned in each new domain/forum (excuse the pun.).

    I'm not trying to teach you anything, you started to try to teach several of your elders (old dogs?) that reversing a string is fundamentally related to sort when it is not.

    Thats precisely why myself, hbromhall and mgoldwasser have voiced our opinions, because we want certforums readers to have as complete a picture as possible given the nature of a forum.
     
  18. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    You guys said that reversing the string is a linear time process in its best case, right? Linear time is O(n). I showed via an alternative algorithm that it is infact not a linear problem, in that elements can be swapped as shown, given the criteria in half the time O(n/2). We even have Knuth's works here at our studio, and I have them at home. I've double-checked my definitions even on the web, asked a colleague who sits directly next to me. None of them disagree.

    My colleague actually thought it was quite cool how I reveresed the string.

    I completely agree. But then I put to you what is the purpose of proof if you still cannot earn respect for your findings. I'm not asking to be treated like a god, just not to be pressed further for proof when I have already clearly provided one that others have accepted.

    I was referring to myself when saying 'old dog'. How is reversing a string not fundamentally related to sort?

    I agree that eveyone should give their perspective, otherwise there wouldn't be room for healthy discussion. I have admitted in the passed when I've made mistakes on these forums, and it's clear that others have a clear problem doing so. Fromm your last posts I can pick out a number of errors and misunderstandings, but to avoid WW3 I'm steering clear.

    If you want me to say you're right, then here you go: "You're right, dmarsh." (Even though I don't believe it for one moment. ;) )
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.
  19. dmarsh

    dmarsh Petabyte Poster

    4,086
    413
    219
    Maybe in the most purest sense yes, however linear in a more general term means 'proportional', non-exponential, non logarithmic, not a curve, ie it simply means it can be represented as a line on a graph, the are many more possible functions that would generate a line. I'd be the first to admit my math and O notation knowledge is poor.

    This is also linear, just its a linear relationship that 'ramps' at half the rate. People who therefore stated it was a 'linear' relationship are therefore still correct.

    You maybe can do one swap sweep at O(n/2), you cannot sort though, sort requires comparison and potentialy more swaps. Read any text on sort, it will state N log N as lower bound.

    Replace sort with reverse and we are in total agreement. However a swap algorithm will probably still perform worse than a reverse iterator copy. A copy would require N operations vs your alleged N/2 but they would most likely be cheaper operations. You would probably require more conditional logic, plus the original method was framed in terms of 'immutable strings' hence hinting at a copy, why copy and swap when you can just copy ?

    I do not believe I have seen any 'proof' ? If you are as good at maths as you say you will understand how hard a good proof is to construct. Who has accepted what exactly ?

    Reversing a string is generally changing the order of elements in memory so that character elements at all indexes are swapped around the midpoint. You do not need to even inspect the contents of the elements, there need be no comparison or collation. You need not even swap, a straight copy is enough providing you have a reverse iterator. A copy with no comparision is therefore not anything like a sort, it is however linear and time taken is directly related to the number of elements.

    Feel free to point out my misunderstandings or poor explanations, I never said my understanding of the subject was legendary, neither was I rude or abbrasive in any of my posts.
     
  20. Mathematix

    Mathematix Megabyte Poster

    969
    35
    74
    As said, dmarsh. You're right. Now go play.
     
    Certifications: BSc(Hons) Comp Sci, BCS Award of Merit
    WIP: Not doing certs. Computer geek.

Share This Page

Loading...
  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.