Splitting a Python string at a delimiter, but a specific one

Is there a way to split a Python string, without using a for loop, at the delimiter closest to the middle of the string?

Like:

The cat jumped over the moon very quickly.

The delimiter would be the space and the resulting strings would be:

The cat jumped over

the moon very quickly.

I see there is a count method that tells me how many spaces there are (though I don't see how to get their indexes). I could then find the middle one by dividing by two, but how do I say "split on this delimiter at this index"? find is close, but it returns only the first index (or the last one, using rfind), not all the indexes where " " is found. I might be overthinking this.

asked Jan 21 at 23:01 by Codejoy

3

What about using split() and then re-joining the first and second half of the resulting list separately?

– Feodoran
Jan 21 at 23:07

2

The algorithm you describe (counting the spaces) would split the sentence into an equal number of words, which conflicts with your requirement (split a string in the middle at the closest delimiter). Which one are you after?

– Selcuk
Jan 21 at 23:19


7 Answers

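How about something like this:

s = "The cat jumped over the moon very quickly"

words = s.split()  # split on runs of whitespace
half = len(words) // 2

s1 = ' '.join(words[:half])  # first half of the words
s2 = ' '.join(words[half:])  # second half of the words

print(s1)
print(s2)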

answered Jan 21 at 23:10 by LonelyDaoist

This method will collapse consecutive spaces into a single space. "First␣sentence.␣␣And␣then␣the␣second." will be split into "First␣sentence.␣And" and "then␣the␣second.". Notice the collapsed double space after the first sentence. Tabs (\t) and newlines (\n) will also be converted to a single space when joined. Using s.split(" ") will split on every individual space, which would maintain consecutive spaces when joined, but you then have a problem when halving the split string.

– Billy Brown
Jan 22 at 8:54
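
The whitespace behaviour described in this comment can be checked directly. A minimal sketch, using a made-up sample string that contains a double space and a tab:

```python
text = "First sentence.  And\tthen the second."

# split() with no argument splits on runs of any whitespace,
# so the double space and the tab both disappear when re-joined.
collapsed = " ".join(text.split())
print(repr(collapsed))
# 'First sentence. And then the second.'

# split(" ") splits on every single space, so re-joining restores
# the original string exactly (the empty string marks the doubled space).
parts = text.split(" ")
print(parts)
# ['First', 'sentence.', '', 'And\tthen', 'the', 'second.']
print(" ".join(parts) == text)
# True
```

The empty string in parts is the "problem when halving" mentioned above: it counts as a word when the list is cut in two.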

This also breaks down for "sentence with looooooooooooooooooong word and many very short ones"

– tobias_k
Jan 22 at 9:35
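
To see how far apart the two notions of "middle" can be, here is a small sketch. The sentence is built in the spirit of the one in this comment, with the run of o's generated so that its length is known; the variable names are only illustrative.

```python
s = "sentence with l" + "o" * 20 + "ng word and many very short ones"

# Split by word count, as in the answer above.
words = s.split()
half = len(words) // 2
by_words = (" ".join(words[:half]), " ".join(words[half:]))

# Split at the space closest to the character midpoint instead.
mid = len(s) // 2
left = s.rfind(" ", 0, mid)   # nearest space before the midpoint, or -1
right = s.find(" ", mid)      # nearest space at/after the midpoint, or -1
candidates = [i for i in (left, right) if i != -1]
cut = min(candidates, key=lambda i: abs(i - mid))
by_chars = (s[:cut], s[cut + 1:])

print([len(part) for part in by_words])  # [42, 24]
print([len(part) for part in by_chars])  # [37, 29]
```

With the word-count split the halves differ by 18 characters; splitting at the space nearest the midpoint brings that down to 8.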


This should work:

def split_text(text):
    middle = len(text) // 2
    under = text.rfind(" ", 0, middle)  # last space before the middle, or -1
    over = text.find(" ", middle)       # first space at or after the middle, or -1
    if under == -1 and over == -1:
        raise ValueError("No separator found in text '{}'".format(text))
    # Use whichever space is closer to the middle; if only one exists, use that one.
    if over == -1 or (under != -1 and middle - under <= over - middle):
        cut = under
    else:
        cut = over
    return (text[:cut], text[cut + 1:])

It does not use a for loop, although a for loop would probably perform better.

I handle the case where the separator is not found anywhere in the string by raising an error; replace the raise ValueError() with whatever handling you prefer for that case.
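
If raising is not what you want, one alternative is to hand back the whole string with an empty second part. This is only a sketch: the helper name split_or_whole is hypothetical (it is not from the answer), it assumes a single-character delimiter, and it drops the delimiter from the result.

```python
def split_or_whole(text, delim=" "):
    """Split text at the single-character delim closest to its middle.

    Hypothetical variant: returns (text, "") instead of raising
    when delim does not occur at all.
    """
    middle = len(text) // 2
    under = text.rfind(delim, 0, middle)  # last delim before the middle, or -1
    over = text.find(delim, middle)       # first delim at/after the middle, or -1
    if under == -1 and over == -1:
        return (text, "")
    # Prefer the closer candidate; fall back to whichever one exists.
    if over == -1 or (under != -1 and middle - under <= over - middle):
        cut = under
    else:
        cut = over
    return (text[:cut], text[cut + 1:])


print(split_or_whole("The cat jumped over the moon very quickly."))
# ('The cat jumped over', 'the moon very quickly.')
print(split_or_whole("nospace"))
# ('nospace', '')
```

Returning a fixed-size tuple keeps unpacking such as a, b = split_or_whole(x) safe for every input, at the cost of not telling "no delimiter" apart from "delimiter at the very end".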

answered Jan 22 at 0:02 by spaniard, edited Jan 22 at 1:18 by Olivier Melançon

2

You mean under = text.rfind(" ", 0, middle).

– CristiFati
Jan 22 at 0:05

1

Right, I will edit it.

– spaniard
Jan 22 at 0:09

1

Algorithmically speaking, this is as efficient as it can get.

– Olivier Melançon
Jan 22 at 0:53

2

@spaniard Although, I think you have to handle the case where find and rfind will return -1

– Olivier Melançon
Jan 22 at 0:54

2

@OlivierMelançon right! thanks I will fix it.

– spaniard
Jan 22 at 1:04


You can use min to find the closest space to the middle and then slice the string.

s = "The cat jumped over the moon very quickly."

mid = min((i for i, c in enumerate(s) if c == ' '), key=lambda i: abs(i - len(s) // 2))

fst, snd = s[:mid], s[mid+1:]

print(fst)
print(snd)

Output

The cat jumped over
the moon very quickly.

answered Jan 22 at 0:48 by Olivier Melançon

Not to nitpick, but that min call consumes a generator (i.e., a loop)

– sapi
Jan 22 at 9:48


I'd just split then rejoin:

text = "The cat jumped over the moon very quickly"
words = text.split()
first_half = " ".join(words[:len(words)//2])
second_half = " ".join(words[len(words)//2:])

answered Jan 21 at 23:08 by Joe Halliwell

1

Depends on whether you want to split by # of words or overall string length.

– Amber
Jan 21 at 23:09

1

This splits into an equal number of words, not characters.

– Olivier Melançon
Jan 21 at 23:10


I think the solutions using split are good. I tried to solve it without split and here's what I came up with.

sOdd = "The cat jumped over the moon very quickly."
sEven = "The cat jumped over the moon very quickly now."

def split_on_delim_mid(s, delim=" "):
  # Index of every delimiter, e.g. [3, 7, 14, 19, 23, 28, 33] for sOdd
  delim_indexes = [i for i, c in enumerate(s) if c == delim]

  # Select the middle entry of delim_indexes (upper middle for even counts)
  middle_index = len(delim_indexes) // 2

  # Return the separated sentences; the second keeps its leading delimiter
  sep = delim_indexes[middle_index]
  return s[:sep], s[sep:]

split_on_delim_mid(sOdd) # ('The cat jumped over', ' the moon very quickly.')
split_on_delim_mid(sEven) # ('The cat jumped over the', ' moon very quickly now.')

The idea here is to:

Find the indexes of the delimiter.

Find the median of that list of indexes.

Split on that.
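
The three steps above can also be sketched with re.finditer, which additionally copes with delimiters longer than one character. The function name split_on_median_delim is made up for this illustration, and dropping the delimiter from the second part is a choice made here, not something the answer does.

```python
import re


def split_on_median_delim(s, delim=" "):
    # Step 1: start index of every (non-overlapping) delimiter occurrence.
    starts = [m.start() for m in re.finditer(re.escape(delim), s)]
    if not starts:
        raise ValueError("delimiter not found")
    # Step 2: the median entry of that list (upper median for even counts).
    sep = starts[len(starts) // 2]
    # Step 3: split there, dropping the delimiter itself.
    return s[:sep], s[sep + len(delim):]


print(split_on_median_delim("The cat jumped over the moon very quickly."))
# ('The cat jumped over', 'the moon very quickly.')
print(split_on_median_delim("a, b, c, d", delim=", "))
# ('a, b', 'c, d')
```

Note that this picks the median delimiter, so it balances the number of pieces on each side rather than the number of characters.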

answered Jan 21 at 23:19 by Charles Landau

Solutions with split() and join() are fine if you want half the words, not half the string (counting characters rather than words). I think the latter is impossible without a for loop or a list comprehension (or an expensive workaround such as recursion to find the indexes of the spaces, maybe).

But if you are fine with a list comprehension, you could do:

phrase = "The cat jumped over the moon very quickly."

# indexes of the separator, here ' '
sep_idxs = [i for i, j in enumerate(phrase) if j == ' ']

# the separator index closest to half the length of the string
sep = min(sep_idxs, key=lambda x: abs(x - (len(phrase) // 2)))

first_half = phrase[:sep]
last_half = phrase[sep+1:]

print([first_half, last_half])

Here I first look for the indexes of the separator with the list comprehension. Then I find the index of the separator closest to the middle of the string, using a custom key for the min() built-in function. Then I split.

The print statement prints ['The cat jumped over', 'the moon very quickly.']
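
Because the separator indexes come out in ascending order, the one closest to the midpoint can also be located with a binary search instead of scanning the whole list with min(). A sketch using the standard bisect module; the function name closest_sep is an assumption made for illustration.

```python
from bisect import bisect_left


def closest_sep(sep_idxs, target):
    """Return the entry of the sorted list sep_idxs closest to target."""
    if not sep_idxs:
        raise ValueError("no separator indexes given")
    pos = bisect_left(sep_idxs, target)
    # Only the neighbours on either side of the insertion point can be closest.
    neighbours = sep_idxs[max(pos - 1, 0):pos + 1]
    return min(neighbours, key=lambda i: abs(i - target))


phrase = "The cat jumped over the moon very quickly."
sep_idxs = [i for i, ch in enumerate(phrase) if ch == ' ']
sep = closest_sep(sep_idxs, len(phrase) // 2)
print([phrase[:sep], phrase[sep + 1:]])
# ['The cat jumped over', 'the moon very quickly.']
```

Building sep_idxs still visits every character, so the saving is only in the lookup step; it matters mainly when the same index list is reused for several targets.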

answered Jan 21 at 23:57 by Valentino

As Valentino says, the answer depends on whether you want to split the number of characters as evenly as possible or the number of words as evenly as possible: split()-based methods will do the latter.

Here's a way to do the former without looping or list comprehension. delim can be any single character. This method wouldn't work with a longer delimiter, since a longer delimiter needn't lie wholly in the first half or wholly in the second half.

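def middlesplit(s, delim=" "):
    if delim not in s:
        return (s,)
    midpoint = (len(s) + 1) // 2
    # Position of the last delim in the first half, or -1.
    left = s[:midpoint].rfind(delim)
    # Second half reversed, so a larger result again means closer to the middle.
    right = s[:midpoint-1:-1].rfind(delim)
    if right > left:
        cut = len(s) - 1 - right  # convert back to an index into s
        return (s[:cut], s[cut+1:])
    else:
        return (s[:left], s[left+1:])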

The reason for using rfind() rather than find() is so that you can choose the larger result, making sure you avoid the -1 if only one side of your string contains delim.
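
The reversed-slice trick can be checked on its own: for a single character, the last match in a reversed string corresponds to the first match in the original. A small sketch with illustrative variable names:

```python
tail = "the moon very quickly."
delim = " "

# rfind on the reversed string measures from the END of the original...
from_end = tail[::-1].rfind(delim)

# ...so converting back gives the same position as a plain find().
first = len(tail) - 1 - from_end
print(from_end, first, tail.find(delim))
# 18 3 3

# When the character is absent, both directions report -1.
print("abc"[::-1].rfind(delim), "abc".rfind(delim))
# -1 -1
```

Since a miss is always -1 and a hit is always 0 or more, comparing the two rfind() results and taking the larger one never selects a side that has no delimiter while the other side has one.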

answered Jan 22 at 10:00 by Especially Lime


7 Answers
7

active

oldest

votes

7 Answers
7

active

oldest

votes

how about something like this:

s = "The cat jumped over the moon very quickly"



l = s.split()



s1 = ' '.join(l[:len(l)//2])

s2 = ' '.join(l[len(l)//2 :])



print(s1)

print(s2)

answered Jan 21 at 23:10

LonelyDaoist

1247

This method will collapse consecutive spaces into a single space. "First␣sentence.␣␣And␣then␣the␣second." will split the string into "First␣sentence.␣And" and "then␣the␣second.". Notice the collapsed double space after the first sentence. Tabs t and newlines n will also be converted to a single space when joined. Using s.split(" ") will split on every individual space, which would maintain consecutive spaces when joined, but you then have a problem when halving the split string.

– Billy Brown
Jan 22 at 8:54

This also breaks down for "sentence with looooooooooooooooooong word and many very short ones"

– tobias_k
Jan 22 at 9:35

add a comment |

how about something like this:

s = "The cat jumped over the moon very quickly"



l = s.split()



s1 = ' '.join(l[:len(l)//2])

s2 = ' '.join(l[len(l)//2 :])



print(s1)

print(s2)

answered Jan 21 at 23:10

LonelyDaoist

1247

This method will collapse consecutive spaces into a single space. "First␣sentence.␣␣And␣then␣the␣second." will split the string into "First␣sentence.␣And" and "then␣the␣second.". Notice the collapsed double space after the first sentence. Tabs t and newlines n will also be converted to a single space when joined. Using s.split(" ") will split on every individual space, which would maintain consecutive spaces when joined, but you then have a problem when halving the split string.

– Billy Brown
Jan 22 at 8:54

This also breaks down for "sentence with looooooooooooooooooong word and many very short ones"

– tobias_k
Jan 22 at 9:35

add a comment |

how about something like this:

s = "The cat jumped over the moon very quickly"



l = s.split()



s1 = ' '.join(l[:len(l)//2])

s2 = ' '.join(l[len(l)//2 :])



print(s1)

print(s2)

answered Jan 21 at 23:10

LonelyDaoist

1247

how about something like this:

s = "The cat jumped over the moon very quickly"



l = s.split()



s1 = ' '.join(l[:len(l)//2])

s2 = ' '.join(l[len(l)//2 :])



print(s1)

print(s2)

answered Jan 21 at 23:10

LonelyDaoist

1247

answered Jan 21 at 23:10

LonelyDaoist

1247

answered Jan 21 at 23:10

LonelyDaoist

1247

answered Jan 21 at 23:10

LonelyDaoist

1247

This method will collapse consecutive spaces into a single space. "First␣sentence.␣␣And␣then␣the␣second." will split the string into "First␣sentence.␣And" and "then␣the␣second.". Notice the collapsed double space after the first sentence. Tabs t and newlines n will also be converted to a single space when joined. Using s.split(" ") will split on every individual space, which would maintain consecutive spaces when joined, but you then have a problem when halving the split string.

– Billy Brown
Jan 22 at 8:54

This also breaks down for "sentence with looooooooooooooooooong word and many very short ones"

– tobias_k
Jan 22 at 9:35

add a comment |

This method will collapse consecutive spaces into a single space. "First␣sentence.␣␣And␣then␣the␣second." will split the string into "First␣sentence.␣And" and "then␣the␣second.". Notice the collapsed double space after the first sentence. Tabs t and newlines n will also be converted to a single space when joined. Using s.split(" ") will split on every individual space, which would maintain consecutive spaces when joined, but you then have a problem when halving the split string.

– Billy Brown
Jan 22 at 8:54

This also breaks down for "sentence with looooooooooooooooooong word and many very short ones"

– tobias_k
Jan 22 at 9:35

This method will collapse consecutive spaces into a single space. "First␣sentence.␣␣And␣then␣the␣second." will split the string into "First␣sentence.␣And" and "then␣the␣second.". Notice the collapsed double space after the first sentence. Tabs t and newlines n will also be converted to a single space when joined. Using s.split(" ") will split on every individual space, which would maintain consecutive spaces when joined, but you then have a problem when halving the split string.

– Billy Brown
Jan 22 at 8:54

This also breaks down for "sentence with looooooooooooooooooong word and many very short ones"

– tobias_k
Jan 22 at 9:35

add a comment |

This should work:

def split_text(text):

    middle = len(text)//2

    under = text.rfind(" ", 0, middle)

    over = text.find(" ", middle)

    if over > under and under != -1:

        return (text[:,middle - under], text[middle - under,:])

    else:

        if over is -1:

              raise ValueError("No separator found in text '{}'".format(text))

        return (text[:,middle + over], text[middle + over,:])

it does not use a for loop, but probably using a for loop would have better performance.

I handle the case where the separator is not found in the whole string by raising an error, but change raise ValueError() for whatever way you want to handle that case.

edited Jan 22 at 1:18

Olivier Melançon

13k11940

answered Jan 22 at 0:02

spaniard

46729

2

You mean under = text.rfind(" ", 0, middle).

– CristiFati
Jan 22 at 0:05

1

Right, I will edit it.

– spaniard
Jan 22 at 0:09

1

Algorithmically speaking, this is as efficient as it can get.

– Olivier Melançon
Jan 22 at 0:53

2

@spaniard Although, I think you have to handle the case where find and rfind will return -1

– Olivier Melançon
Jan 22 at 0:54

2

@OlivierMelançon right! thanks I will fix it.

– spaniard
Jan 22 at 1:04

add a comment |

This should work:

def split_text(text):

    middle = len(text)//2

    under = text.rfind(" ", 0, middle)

    over = text.find(" ", middle)

    if over > under and under != -1:

        return (text[:,middle - under], text[middle - under,:])

    else:

        if over is -1:

              raise ValueError("No separator found in text '{}'".format(text))

        return (text[:,middle + over], text[middle + over,:])

it does not use a for loop, but probably using a for loop would have better performance.

I handle the case where the separator is not found in the whole string by raising an error, but change raise ValueError() for whatever way you want to handle that case.

edited Jan 22 at 1:18

Olivier Melançon

13k11940

answered Jan 22 at 0:02

spaniard

46729

2

You mean under = text.rfind(" ", 0, middle).

– CristiFati
Jan 22 at 0:05

1

Right, I will edit it.

– spaniard
Jan 22 at 0:09

1

Algorithmically speaking, this is as efficient as it can get.

– Olivier Melançon
Jan 22 at 0:53

2

@spaniard Although, I think you have to handle the case where find and rfind will return -1

– Olivier Melançon
Jan 22 at 0:54

2

@OlivierMelançon right! thanks I will fix it.

– spaniard
Jan 22 at 1:04

add a comment |

This should work:

def split_text(text):

    middle = len(text)//2

    under = text.rfind(" ", 0, middle)

    over = text.find(" ", middle)

    if over > under and under != -1:

        return (text[:,middle - under], text[middle - under,:])

    else:

        if over is -1:

              raise ValueError("No separator found in text '{}'".format(text))

        return (text[:,middle + over], text[middle + over,:])

it does not use a for loop, but probably using a for loop would have better performance.

I handle the case where the separator is not found in the whole string by raising an error, but change raise ValueError() for whatever way you want to handle that case.

edited Jan 22 at 1:18

Olivier Melançon

13k11940

answered Jan 22 at 0:02

spaniard

46729

This should work:

def split_text(text):

    middle = len(text)//2

    under = text.rfind(" ", 0, middle)

    over = text.find(" ", middle)

    if over > under and under != -1:

        return (text[:,middle - under], text[middle - under,:])

    else:

        if over is -1:

              raise ValueError("No separator found in text '{}'".format(text))

        return (text[:,middle + over], text[middle + over,:])

it does not use a for loop, but probably using a for loop would have better performance.

I handle the case where the separator is not found in the whole string by raising an error, but change raise ValueError() for whatever way you want to handle that case.

edited Jan 22 at 1:18

Olivier Melançon

13k11940

answered Jan 22 at 0:02

spaniard

46729

edited Jan 22 at 1:18

Olivier Melançon

13k11940

edited Jan 22 at 1:18

Olivier Melançon

13k11940

edited Jan 22 at 1:18

Olivier Melançon

13k11940

answered Jan 22 at 0:02

spaniard

46729

answered Jan 22 at 0:02

spaniard

46729

answered Jan 22 at 0:02

spaniard

46729

2

You mean under = text.rfind(" ", 0, middle).

– CristiFati
Jan 22 at 0:05

1

Right, I will edit it.

– spaniard
Jan 22 at 0:09

1

Algorithmically speaking, this is as efficient as it can get.

– Olivier Melançon
Jan 22 at 0:53

2

@spaniard Although, I think you have to handle the case where find and rfind will return -1

– Olivier Melançon
Jan 22 at 0:54

2

@OlivierMelançon right! thanks I will fix it.

– spaniard
Jan 22 at 1:04

add a comment |

2

You mean under = text.rfind(" ", 0, middle).

– CristiFati
Jan 22 at 0:05

1

Right, I will edit it.

– spaniard
Jan 22 at 0:09

1

Algorithmically speaking, this is as efficient as it can get.

– Olivier Melançon
Jan 22 at 0:53

2

@spaniard Although, I think you have to handle the case where find and rfind will return -1

– Olivier Melançon
Jan 22 at 0:54

2

@OlivierMelançon right! thanks I will fix it.

– spaniard
Jan 22 at 1:04

You mean under = text.rfind(" ", 0, middle).

– CristiFati
Jan 22 at 0:05

Right, I will edit it.

– spaniard
Jan 22 at 0:09

Algorithmically speaking, this is as efficient as it can get.

– Olivier Melançon
Jan 22 at 0:53

@spaniard Although, I think you have to handle the case where find and rfind will return -1

– Olivier Melançon
Jan 22 at 0:54

@OlivierMelançon right! thanks I will fix it.

– spaniard
Jan 22 at 1:04

add a comment |

You can use min to find the closest space to the middle and then slice the string.

s = "The cat jumped over the moon very quickly."



mid = min((i for i, c in enumerate(s) if c == ' '), key=lambda i: abs(i - len(s) // 2))



fst, snd = s[:mid], s[mid+1:]



print(fst)

print(snd)

Output

The cat jumped over

the moon very quickly.

answered Jan 22 at 0:48

Olivier Melançon

13k11940

Not to nitpick, but that min call consumes a generator (i.e., a loop)

– sapi
Jan 22 at 9:48

add a comment |

You can use min to find the closest space to the middle and then slice the string.

s = "The cat jumped over the moon very quickly."



mid = min((i for i, c in enumerate(s) if c == ' '), key=lambda i: abs(i - len(s) // 2))



fst, snd = s[:mid], s[mid+1:]



print(fst)

print(snd)

Output

The cat jumped over

the moon very quickly.

answered Jan 22 at 0:48

Olivier Melançon

13k11940

Not to nitpick, but that min call consumes a generator (i.e., a loop)

– sapi
Jan 22 at 9:48

add a comment |

You can use min to find the closest space to the middle and then slice the string.

s = "The cat jumped over the moon very quickly."



mid = min((i for i, c in enumerate(s) if c == ' '), key=lambda i: abs(i - len(s) // 2))



fst, snd = s[:mid], s[mid+1:]



print(fst)

print(snd)

Output

The cat jumped over

the moon very quickly.

answered Jan 22 at 0:48

Olivier Melançon

13k11940

You can use min to find the closest space to the middle and then slice the string.

s = "The cat jumped over the moon very quickly."



mid = min((i for i, c in enumerate(s) if c == ' '), key=lambda i: abs(i - len(s) // 2))



fst, snd = s[:mid], s[mid+1:]



print(fst)

print(snd)

Output

The cat jumped over

the moon very quickly.

answered Jan 22 at 0:48

Olivier Melançon

13k11940

answered Jan 22 at 0:48

Olivier Melançon

13k11940

answered Jan 22 at 0:48

Olivier Melançon

13k11940

answered Jan 22 at 0:48

Olivier Melançon

13k11940

Not to nitpick, but that min call consumes a generator (i.e., a loop)

– sapi
Jan 22 at 9:48

add a comment |

Not to nitpick, but that min call consumes a generator (i.e., a loop)

– sapi
Jan 22 at 9:48

Not to nitpick, but that min call consumes a generator (i.e., a loop)

– sapi
Jan 22 at 9:48

add a comment |

I'd just split then rejoin:

text = "The cat jumped over the moon very quickly"

words = text.split()

first_half = " ".join(words[:len(words)//2])

answered Jan 21 at 23:08

Joe Halliwell

627317

1

Depends on whether you want to split by # of words or overall string length.

– Amber
Jan 21 at 23:09

1

This split in equal amount of words, not characters

– Olivier Melançon
Jan 21 at 23:10

add a comment |

I'd just split then rejoin:

text = "The cat jumped over the moon very quickly"

words = text.split()

first_half = " ".join(words[:len(words)//2])

answered Jan 21 at 23:08

Joe Halliwell

627317

1

Depends on whether you want to split by # of words or overall string length.

– Amber
Jan 21 at 23:09

1

This split in equal amount of words, not characters

– Olivier Melançon
Jan 21 at 23:10

add a comment |

I'd just split then rejoin:

text = "The cat jumped over the moon very quickly"

words = text.split()

first_half = " ".join(words[:len(words)//2])

answered Jan 21 at 23:08

Joe Halliwell

627317

I'd just split then rejoin:

text = "The cat jumped over the moon very quickly"

words = text.split()

first_half = " ".join(words[:len(words)//2])

answered Jan 21 at 23:08

Joe Halliwell

627317

answered Jan 21 at 23:08

Joe Halliwell

627317

answered Jan 21 at 23:08

Joe Halliwell

627317

answered Jan 21 at 23:08

Joe Halliwell

627317

1

Depends on whether you want to split by # of words or overall string length.

– Amber
Jan 21 at 23:09

1

This split in equal amount of words, not characters

– Olivier Melançon
Jan 21 at 23:10

add a comment |

1

Depends on whether you want to split by # of words or overall string length.

– Amber
Jan 21 at 23:09

1

This split in equal amount of words, not characters

– Olivier Melançon
Jan 21 at 23:10

Depends on whether you want to split by # of words or overall string length.

– Amber
Jan 21 at 23:09

This split in equal amount of words, not characters

– Olivier Melançon
Jan 21 at 23:10

add a comment |

I think the solutions using split are good. I tried to solve it without split and here's what I came up with.

sOdd = "The cat jumped over the moon very quickly."

sEven = "The cat jumped over the moon very quickly now."



def split_on_delim_mid(s, delim=" "):

  delim_indexes = [

      x[0] for x in enumerate(s) if x[1]==delim

  ] # [3, 7, 14, 19, 23, 28, 33]



  # Select the correct number from delim_indexes

  middle = len(delim_indexes)/2

  if middle % 2 == 0:

    middle_index = middle

  else:

    middle_index = (middle-.5)



  # Return the separated sentances

  sep = delim_indexes[int(middle_index)]

  return s[:sep], s[sep:]



split_on_delim_mid(sOdd) # ('The cat jumped over', ' the moon very quickly.')

split_on_delim_mid(sEven) # ('The cat jumped over the', ' moon very quickly now.')

The idea here is to:

Find the indexes of the deliminator.

Find the median of that list of indexes

Split on that.

answered Jan 21 at 23:19

Charles Landau

2,3781216

add a comment |

I think the solutions using split are good. I tried to solve it without split and here's what I came up with.

sOdd = "The cat jumped over the moon very quickly."

sEven = "The cat jumped over the moon very quickly now."



def split_on_delim_mid(s, delim=" "):

  delim_indexes = [

      x[0] for x in enumerate(s) if x[1]==delim

  ] # [3, 7, 14, 19, 23, 28, 33]



  # Select the correct number from delim_indexes

  middle = len(delim_indexes)/2

  if middle % 2 == 0:

    middle_index = middle

  else:

    middle_index = (middle-.5)



  # Return the separated sentances

  sep = delim_indexes[int(middle_index)]

  return s[:sep], s[sep:]



split_on_delim_mid(sOdd) # ('The cat jumped over', ' the moon very quickly.')

split_on_delim_mid(sEven) # ('The cat jumped over the', ' moon very quickly now.')

The idea here is to:

Find the indexes of the deliminator.

Find the median of that list of indexes

Split on that.

answered Jan 21 at 23:19

Charles Landau

2,3781216

add a comment |

I think the solutions using split are good. I tried to solve it without split and here's what I came up with.

sOdd = "The cat jumped over the moon very quickly."

sEven = "The cat jumped over the moon very quickly now."



def split_on_delim_mid(s, delim=" "):

  delim_indexes = [

      x[0] for x in enumerate(s) if x[1]==delim

  ] # [3, 7, 14, 19, 23, 28, 33]



  # Select the correct number from delim_indexes

  middle = len(delim_indexes)/2

  if middle % 2 == 0:

    middle_index = middle

  else:

    middle_index = (middle-.5)



  # Return the separated sentances

  sep = delim_indexes[int(middle_index)]

  return s[:sep], s[sep:]



split_on_delim_mid(sOdd) # ('The cat jumped over', ' the moon very quickly.')

split_on_delim_mid(sEven) # ('The cat jumped over the', ' moon very quickly now.')

The idea here is to:

Find the indexes of the deliminator.

Find the median of that list of indexes

Split on that.

answered Jan 21 at 23:19

Charles Landau

2,3781216

I think the solutions using split are good. I tried to solve it without split and here's what I came up with.

sOdd = "The cat jumped over the moon very quickly."

sEven = "The cat jumped over the moon very quickly now."



def split_on_delim_mid(s, delim=" "):

  delim_indexes = [

      x[0] for x in enumerate(s) if x[1]==delim

  ] # [3, 7, 14, 19, 23, 28, 33]



  # Select the correct number from delim_indexes

  middle = len(delim_indexes)/2

  if middle % 2 == 0:

    middle_index = middle

  else:

    middle_index = (middle-.5)



  # Return the separated sentances

  sep = delim_indexes[int(middle_index)]

  return s[:sep], s[sep:]



split_on_delim_mid(sOdd) # ('The cat jumped over', ' the moon very quickly.')

split_on_delim_mid(sEven) # ('The cat jumped over the', ' moon very quickly now.')

The idea here is to:

Find the indexes of the deliminator.

Find the median of that list of indexes

Split on that.

answered Jan 21 at 23:19

Charles Landau

2,3781216

answered Jan 21 at 23:19

Charles Landau

2,3781216

answered Jan 21 at 23:19

Charles Landau

2,3781216

answered Jan 21 at 23:19

Charles Landau

2,3781216

add a comment |

But if you are fine with a list comprehension, you could do:

phrase = "The cat jumped over the moon very quickly."



#indexes of separator, here the ' '

sep_idxs = [i for i, j in enumerate(phrase) if j == ' ']



#getting the separator index closer to half the length of the string

sep = min(sep_idxs, key=lambda x:abs(x-(len(phrase) // 2)))



first_half = phrase[:sep]

last_half = phrase[sep+1:]



print([first_half, last_half])

The print statement prints ['The cat jumped over', 'the moon very quickly.']

answered Jan 21 at 23:57

Valentino

467129


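def middlesplit(s, delim=" "):
    # Assumes a single-character delimiter (the right half is searched reversed).
    if delim not in s:
        return (s,)
    midpoint = (len(s) + 1) // 2
    # Index of the last delimiter in the left half, or -1 if there is none.
    left = s[:midpoint].rfind(delim)
    # Right half reversed: offset from the end of s of the delimiter nearest
    # the middle, or -1 if there is none.
    right = s[:midpoint-1:-1].rfind(delim)
    if right > left:
        # Convert the offset from the end into a normal index. Slicing with
        # s[-right:] would return the whole string when right == 0.
        cut = len(s) - 1 - right
    else:
        cut = left
    return (s[:cut], s[cut+1:])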

The reason for using rfind() rather than find() is so that you can choose the larger result, making sure you avoid the -1 if only one side of your string contains delim.
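The same search can also be written without slicing or reversing, using the optional start and end arguments of `str.rfind()` and `str.find()`. This is a sketch of an alternative, not the answer's own method: the name `middlesplit_idx` is hypothetical, the midpoint is taken as `len(s) // 2`, ties go to the left candidate, and a non-empty delimiter is assumed. Unlike the reversed-slice approach, it also works for multi-character delimiters.

```python
def middlesplit_idx(s, delim=" "):
    """Hypothetical variant: split s at the delimiter nearest the middle."""
    mid = len(s) // 2
    # Last delimiter that starts before mid (-1 if none). The end bound is
    # extended so a delimiter straddling mid is still found.
    left = s.rfind(delim, 0, mid + len(delim) - 1)
    # First delimiter that starts at or after mid (-1 if none).
    right = s.find(delim, mid)
    if left == -1 and right == -1:
        return (s,)
    if left == -1:
        cut = right
    elif right == -1:
        cut = left
    else:
        # Both sides have a candidate: take the closer one, left on a tie.
        cut = left if mid - left <= right - mid else right
    return (s[:cut], s[cut + len(delim):])


print(middlesplit_idx("The cat jumped over the moon very quickly."))
```

Here the -1 results are handled explicitly instead of relying on the larger of two values, which keeps both indexes measured from the start of the string.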

answered Jan 22 at 10:00

Especially Lime

1112

