Splitting a Python string at a delimiter, but a specific one












6















Is there a way, without using a for loop, to split a Python string at the delimiter closest to the middle of the string?



Like:



The cat jumped over the moon very quickly.


The delimiter would be the space and the resulting strings would be:



The cat jumped over
the moon very quickly.


I see there is a count method that tells me how many spaces the string contains (though I don't see how to get their indexes). I could then pick the middle one by dividing by two, but how do I say "split on the delimiter at this index"? find is close, but it returns only the first index (or the last one, using rfind), not every index where " " is found. I might be overthinking this.










  • 3





    What about using split() and then re-joining the first and second halves of the resulting list separately?

    – Feodoran
    Jan 21 at 23:07






  • 2





    The algorithm you describe (counting the spaces) would split the sentence into an equal number of words, which conflicts with your requirement (split the string at the delimiter closest to the middle). Which one are you after?

    – Selcuk
    Jan 21 at 23:19
















python
















asked Jan 21 at 23:01 by Codejoy




















7 Answers


















1














How about something like this:



s = "The cat jumped over the moon very quickly"

# Split into words, then re-join each half of the word list.
words = s.split()

s1 = ' '.join(words[:len(words)//2])
s2 = ' '.join(words[len(words)//2:])

print(s1)
print(s2)





  • This method will collapse consecutive spaces into a single space. "First␣sentence.␣␣And␣then␣the␣second." will be split into "First␣sentence.␣And" and "then␣the␣second.". Notice the collapsed double space after the first sentence. Tabs (\t) and newlines (\n) will also be converted to a single space when joined. Using s.split(" ") will split on every individual space, which would maintain consecutive spaces when joined, but you then have a problem when halving the split string.

    – Billy Brown
    Jan 22 at 8:54











  • This also breaks down for "sentence with looooooooooooooooooong word and many very short ones"

    – tobias_k
    Jan 22 at 9:35
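
Both caveats raised in these comments are easy to reproduce. The following is a minimal sketch, assuming a hypothetical helper named split_by_words (not from the answer) that wraps the split()/join() idea:

```python
def split_by_words(s):
    """Hypothetical wrapper around the split()/join() approach."""
    words = s.split()
    half = len(words) // 2
    return " ".join(words[:half]), " ".join(words[half:])

# Consecutive whitespace is collapsed to a single space on re-joining.
first, second = split_by_words("First sentence.  And then the second.")
print(repr(first))   # 'First sentence. And'
print(repr(second))  # 'then the second.'

# Halving by word count can land far from the middle of the characters.
long_word = "sentence with looooooooooooooooooong word and many very short ones"
first, second = split_by_words(long_word)
print(len(first), len(second))  # the first half is much longer
```

If the original spacing matters, or the halves should be balanced by character count, slicing the string at an index (as in the other answers) avoids both problems.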



















5














This should work:



def split_text(text):
    middle = len(text) // 2
    # Closest space before the middle, and closest space at or after it.
    under = text.rfind(" ", 0, middle)
    over = text.find(" ", middle)
    if under == -1 and over == -1:
        raise ValueError("No separator found in text '{}'".format(text))
    # Pick whichever space exists and is nearer to the middle.
    if over == -1 or (under != -1 and middle - under <= over - middle):
        cut = under
    else:
        cut = over
    return (text[:cut], text[cut + 1:])


It does not use a for loop; find and rfind do the scanning internally, so an explicit loop is unlikely to be faster.



I handle the case where the separator is not found anywhere in the string by raising an error; replace the raise ValueError() with whatever handling suits your case.
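
If raising is not what you want, one alternative is to return the text unsplit. The following is a minimal sketch, assuming a hypothetical function split_near_middle (not part of the answer) that takes a single-character delimiter as a parameter and returns the whole text plus an empty string when the delimiter does not occur:

```python
def split_near_middle(text, delim=" "):
    """Split text at the single-character delim closest to its middle.

    Hypothetical variant: returns (text, "") instead of raising when
    delim does not occur in text.
    """
    middle = len(text) // 2
    under = text.rfind(delim, 0, middle)  # nearest delim before the middle
    over = text.find(delim, middle)       # nearest delim at or after the middle
    if under == -1 and over == -1:
        return text, ""
    # Prefer the side that exists; on a tie, take the earlier delimiter.
    if over == -1 or (under != -1 and middle - under <= over - middle):
        cut = under
    else:
        cut = over
    return text[:cut], text[cut + 1:]

print(split_near_middle("The cat jumped over the moon very quickly."))
# ('The cat jumped over', 'the moon very quickly.')
print(split_near_middle("nospace"))
# ('nospace', '')
```

Returning a two-element tuple in both cases keeps the caller's unpacking code the same whether or not a split happened.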






  • 2





    You mean under = text.rfind(" ", 0, middle).

    – CristiFati
    Jan 22 at 0:05








  • 1





    Right, I will edit it.

    – spaniard
    Jan 22 at 0:09






  • 1





    Algorithmically speaking, this is as efficient as it can get.

    – Olivier Melançon
    Jan 22 at 0:53






  • 2





    @spaniard Although, I think you have to handle the case where find and rfind will return -1

    – Olivier Melançon
    Jan 22 at 0:54






  • 2





    @OlivierMelançon right! thanks I will fix it.

    – spaniard
    Jan 22 at 1:04



















4














You can use min to find the closest space to the middle and then slice the string.



s = "The cat jumped over the moon very quickly."

mid = min((i for i, c in enumerate(s) if c == ' '), key=lambda i: abs(i - len(s) // 2))

fst, snd = s[:mid], s[mid+1:]

print(fst)
print(snd)


Output



The cat jumped over
the moon very quickly.





  • Not to nitpick, but that min call consumes a generator (i.e., a loop)

    – sapi
    Jan 22 at 9:48



















2














I'd just split then rejoin:



text = "The cat jumped over the moon very quickly"
words = text.split()
middle = len(words) // 2
first_half = " ".join(words[:middle])
second_half = " ".join(words[middle:])





  • 1





    Depends on whether you want to split by # of words or overall string length.

    – Amber
    Jan 21 at 23:09






  • 1





    This splits into an equal number of words, not characters.

    – Olivier Melançon
    Jan 21 at 23:10



















2














I think the solutions using split are good. I tried to solve it without split and here's what I came up with.



sOdd = "The cat jumped over the moon very quickly."
sEven = "The cat jumped over the moon very quickly now."

def split_on_delim_mid(s, delim=" "):
    # Indexes of every delimiter, e.g. [3, 7, 14, 19, 23, 28, 33] for sOdd
    delim_indexes = [i for i, c in enumerate(s) if c == delim]

    # Select the middle entry of delim_indexes
    middle_index = len(delim_indexes) // 2

    # Return the separated sentences (the second keeps its leading delimiter)
    sep = delim_indexes[middle_index]
    return s[:sep], s[sep:]

split_on_delim_mid(sOdd) # ('The cat jumped over', ' the moon very quickly.')
split_on_delim_mid(sEven) # ('The cat jumped over the', ' moon very quickly now.')


The idea here is to:




  • Find the indexes of the delimiter.

  • Find the median of that list of indexes.

  • Split on that.
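
The three steps above can also be sketched with re.finditer, which yields every match position and so addresses the question of how to get all the delimiter indexes. This is only a sketch, not the answer's own code: delimiter_indexes and split_at_median_delim are hypothetical names, and this version drops the delimiter from the second half and returns the string unsplit when there is no delimiter.

```python
import re

def delimiter_indexes(s, delim=" "):
    """Return the start index of every occurrence of delim in s."""
    return [m.start() for m in re.finditer(re.escape(delim), s)]

def split_at_median_delim(s, delim=" "):
    """Split s at the middle entry of its list of delimiter indexes."""
    indexes = delimiter_indexes(s, delim)
    if not indexes:
        return s, ""
    sep = indexes[len(indexes) // 2]
    return s[:sep], s[sep + len(delim):]

print(delimiter_indexes("The cat jumped over the moon very quickly."))
# [3, 7, 14, 19, 23, 28, 33]
print(split_at_median_delim("The cat jumped over the moon very quickly."))
# ('The cat jumped over', 'the moon very quickly.')
```

Note that this balances the number of delimiters on each side, not the number of characters, so a very long word can still push the split away from the middle of the string.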






    1














Solutions with split() and join() are fine if you want half the words, not half the string (counting characters rather than words). I think the latter is impossible without a for loop or a list comprehension (or an expensive workaround, such as recursion to find the indexes of the spaces, maybe).



But if you are fine with a list comprehension, you could do:



phrase = "The cat jumped over the moon very quickly."

# indexes of the separator, here the ' '
sep_idxs = [i for i, j in enumerate(phrase) if j == ' ']

# the separator index closest to half the length of the string
sep = min(sep_idxs, key=lambda x: abs(x - (len(phrase) // 2)))

first_half = phrase[:sep]
last_half = phrase[sep+1:]

print([first_half, last_half])


Here I first look for the indexes of the separator with the list comprehension. Then I find the index of the separator closest to the middle of the string, using a custom key for the min() built-in function. Then I split.



The print call outputs ['The cat jumped over', 'the moon very quickly.']






      0














As Valentino says, the answer depends on whether you want to split the number of characters as evenly as possible or the number of words as evenly as possible: split()-based methods will do the latter.



Here's a way to do the former without looping or a list comprehension. delim can be any single character. This method won't work if you want a longer delimiter, since in that case it needn't be wholly in the first half or wholly in the second half.



def middlesplit(s, delim=" "):
    if delim not in s:
        return (s,)
    midpoint = (len(s) + 1) // 2
    # Highest index of delim in the first half, or -1
    left = s[:midpoint].rfind(delim)
    # Same search in the reversed second half, so right counts from the end
    right = s[:midpoint - 1:-1].rfind(delim)
    if right > left:
        # Convert to an index from the start; s[-right:] would return
        # the whole string when right == 0
        cut = len(s) - 1 - right
        return (s[:cut], s[cut + 1:])
    else:
        return (s[:left], s[left + 1:])


The reason for using rfind() rather than find() is so that you can choose the larger result, making sure you avoid the -1 if only one side of your string contains delim.
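
The -1 convention mentioned above can be checked directly. This is a small sketch with an illustrative input string that is not from the answer:

```python
s = "ab cdefgh"
midpoint = (len(s) + 1) // 2  # 5

# rfind returns the highest index of the match, or -1 if there is none.
left = s[:midpoint].rfind(" ")          # searches "ab cd"
right = s[:midpoint - 1:-1].rfind(" ")  # searches "hgfe", the reversed second half

# Any real match (>= 0) compares greater than -1, so the larger value
# always belongs to a side that actually contains the delimiter.
print(left, right)  # 2 -1
```

Because both values grow as the match gets closer to the midpoint, comparing them picks the side whose delimiter is nearer to the middle.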









































answered Jan 21 at 23:10 by LonelyDaoist (the split()/join() answer that prints s1 and s2)





















answered Jan 22 at 0:02 by spaniard, edited Jan 22 at 1:18 by Olivier Melançon (the split_text answer using find and rfind)


















answered Jan 22 at 0:48 by Olivier Melançon (the answer that passes a generator expression to min)























        answered Jan 21 at 23:08

        Joe Halliwell



















        2














        I think the solutions using split are good. I tried to solve it without split, and here's what I came up with.

        sOdd = "The cat jumped over the moon very quickly."
        sEven = "The cat jumped over the moon very quickly now."

        def split_on_delim_mid(s, delim=" "):
            # Indexes of every delimiter, e.g. [3, 7, 14, 19, 23, 28, 33] for sOdd
            delim_indexes = [i for i, ch in enumerate(s) if ch == delim]

            # Pick the middle entry of that list (the upper one when the count is even)
            middle_index = len(delim_indexes) // 2

            # Return the separated sentences; the second part keeps the delimiter
            sep = delim_indexes[middle_index]
            return s[:sep], s[sep:]

        split_on_delim_mid(sOdd) # ('The cat jumped over', ' the moon very quickly.')
        split_on_delim_mid(sEven) # ('The cat jumped over the', ' moon very quickly now.')

        The idea here is to:

        • Find the indexes of the delimiter.

        • Find the median of that list of indexes.

        • Split on that.
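The median step in the list above could also lean on the standard library. This is only a sketch, assuming Python 3.4 or later; the helper name split_on_median_delim is made up for illustration. statistics.median_high returns the upper of the two middle values when the count is even, so it picks the same delimiters as the two example calls above. Unlike the function above, this variant drops the delimiter from the second part and returns a one-element tuple when the delimiter is absent.

```python
from statistics import median_high


def split_on_median_delim(s, delim=" "):
    """Split s at the median occurrence of a single-character delim (hypothetical helper)."""
    # Indexes of every occurrence of the delimiter
    delim_indexes = [i for i, ch in enumerate(s) if ch == delim]
    if not delim_indexes:
        return (s,)
    # Upper median: always one of the actual indexes, never an average
    sep = median_high(delim_indexes)
    return s[:sep], s[sep + 1:]


odd = split_on_median_delim("The cat jumped over the moon very quickly.")
even = split_on_median_delim("The cat jumped over the moon very quickly now.")
print(odd)
print(even)
```

Note that the median of the delimiter indexes balances the number of words on each side, not the number of characters.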






            answered Jan 21 at 23:19

            Charles Landau























                1














                Solutions with split() and join() are fine if you want half the words, not half the string (that is, half the characters). I think the latter is impossible without a for loop or a list comprehension (or perhaps an expensive workaround such as recursion to find the indexes of the spaces).

                But if you are fine with a list comprehension, you could do:

                phrase = "The cat jumped over the moon very quickly."

                # indexes of the separator, here ' '
                sep_idxs = [i for i, j in enumerate(phrase) if j == ' ']

                # the separator index closest to half the length of the string
                sep = min(sep_idxs, key=lambda x: abs(x - len(phrase) // 2))

                first_half = phrase[:sep]
                last_half = phrase[sep+1:]

                print([first_half, last_half])

                First I collect the indexes of the separator with the list comprehension. Then I find the separator index closest to the middle of the string, using a custom key for the built-in min() function. Then I slice the string around that index.

                The print statement prints ['The cat jumped over', 'the moon very quickly.']
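The same three steps can be wrapped in a function that also accepts a separator longer than one character. This is a sketch under assumptions, not part of the answer: the name split_at_closest is made up, re.finditer with re.escape is used to collect the start of every literal, non-overlapping occurrence, and distance is measured from the start of the separator to len(phrase) // 2, as in the code above.

```python
import re


def split_at_closest(phrase, sep=" "):
    """Split phrase at the occurrence of sep closest to the middle (hypothetical helper)."""
    if not sep:
        raise ValueError("sep must be a non-empty string")
    # Start index of each literal, non-overlapping occurrence of sep
    starts = [m.start() for m in re.finditer(re.escape(sep), phrase)]
    if not starts:
        return [phrase]
    half = len(phrase) // 2
    # Occurrence whose start is nearest the middle; ties go to the leftmost
    cut = min(starts, key=lambda i: abs(i - half))
    # Skip the whole separator, whatever its length
    return [phrase[:cut], phrase[cut + len(sep):]]


print(split_at_closest("The cat jumped over the moon very quickly."))
print(split_at_closest("one, two, three, four", sep=", "))
```

The list comprehension is still a loop in disguise, so this does not get around the limitation described above; it only generalizes the separator.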






                    answered Jan 21 at 23:57

                    Valentino























                        0














                        As Valentino says, the answer depends on whether you want to split the number of characters as evenly as possible or the number of words as evenly as possible: split()-based methods will do the latter.

                        Here's a way to do the former without a loop or list comprehension. delim can be any single character. This method wouldn't work for a longer delimiter, since an occurrence could then straddle the midpoint and lie wholly in neither half.

                        def middlesplit(s, delim=" "):
                            if delim not in s:
                                return (s,)
                            midpoint = (len(s) + 1) // 2
                            # Last delimiter in the first half, as an index from the start (-1 if none)
                            left = s[:midpoint].rfind(delim)
                            # First delimiter in the second half, as a distance from the end (-1 if none)
                            right = s[:midpoint-1:-1].rfind(delim)
                            if right > left:
                                # len(s) - right rather than -right, so right == 0 gives an empty second part
                                return (s[:-right-1], s[len(s)-right:])
                            else:
                                return (s[:left], s[left+1:])

                        Both halves are searched with rfind() (the second half reversed) rather than find(), so that the larger result is always the delimiter closer to the middle. Choosing the larger result also avoids the -1 returned when only one side of your string contains delim.
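A variant of the same idea can use the optional start and end arguments of str.find and str.rfind instead of slicing and reversing. This is only a sketch; the name split_near_middle and the tie-breaking rule (ties go left) are assumptions. Letting the left-hand search run up to mid + len(delim) - 1 also catches an occurrence that straddles the midpoint, so a longer delimiter works here.

```python
def split_near_middle(s, delim=" "):
    """Split s at the occurrence of delim nearest its centre (hypothetical helper)."""
    if not delim:
        raise ValueError("delim must be a non-empty string")
    n, k = len(s), len(delim)
    mid = n // 2
    # Last occurrence that starts before mid (it may extend past mid); -1 if none
    left = s.rfind(delim, 0, mid + k - 1)
    # First occurrence that starts at or after mid; -1 if none
    right = s.find(delim, mid)
    if left == -1 and right == -1:
        return (s,)

    def offset(i):
        # Twice the distance between the delimiter's centre and the string's centre
        return abs(2 * i + k - n)

    # Ties go to the left-hand occurrence
    if right == -1 or (left != -1 and offset(left) <= offset(right)):
        cut = left
    else:
        cut = right
    return s[:cut], s[cut + k:]


print(split_near_middle("The cat jumped over the moon very quickly."))
print(split_near_middle("aa--bb", delim="--"))
```

Like the function above, this needs no explicit loop or comprehension in Python code; the scanning happens inside find and rfind.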






                            answered Jan 22 at 10:00

                            Especially Lime





























