Normalize authors' names in .bib file












As most of you may know, there are many acceptable ways of writing authors' names. However, when I export .bib entries from software like Zotero (which sometimes exports Last, First and sometimes First Last) or JabRef (which exports fields the way you first entered them), or copy them from the Internet, I get authors' names in many different formats. Although these sources rarely provide them in incorrect or unusable forms, I'd like to normalize my .bib files so that I can easily find authors' names with Ctrl + F, fill in their names if they are abbreviated, and so on.



I am trying to use BibTool, which I already use to clean, format and sort my files. I've tried the following rules in my .bibtoolrsc file:



new.format.type = {17="%f%v%l%j"}
new.format.type = {17="%0f%0v%0l%0j"}
new.format.type = {17="%0f %0v %0l %0j"}


but when I run the bibtool command, all of my other rules work except these (I've tried each of them separately, of course).



Here is an example of what I want. I would like something like this:



author = {Brown, Noam and Sandholm, Tuomas}


to become this:



author = {Noam Brown and Tuomas Sandholm}


Does anyone know how to achieve this? I would prefer if I could use BibTool for everything, but if someone recommends some other command, that is acceptable too.



Edit: here is the content of my .bibtoolrsc file.










bibtex bibtool






edited Apr 13 '17 at 12:35









Community











asked Feb 17 '17 at 5:32









Douglas De Rizzo Meneghetti

  • According to the bibtool documentation, you should not have = before the brace, so just new.format.type {17=....}
    – Andrew Swann
    Feb 17 '17 at 8:22






  • Your attempt to sanitize is going in the wrong direction, I think: better practice is Brown, Noam and Sandholm, Tuomas.
    – jon
    Feb 17 '17 at 15:30










  • @AndrewSwann I've tried with and without the equals sign. Still, my other rules work, except for this one.
    – Douglas De Rizzo Meneghetti
    Feb 18 '17 at 17:15










  • There are spaces in the linked file not shown in your snippet here. Do they make a difference?
    – Andrew Swann
    Feb 18 '17 at 17:35










  • Neither the presence of spaces before or after the equals sign nor the presence of an equals sign between an entry name and the opening curly brace affects the running of the program. Spaces after the opening curly brace and before the closing one also change nothing.
    – Douglas De Rizzo Meneghetti
    Feb 20 '17 at 7:24


















2 Answers


















Here is my attempt with a Python script using bibtexparser (note, it will replace the .bib files in-place! Modify the script if you do not want that):



#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import sys
import pprint

import bibtexparser
from bibtexparser.bparser import BibTexParser

# kill stdout terminal buffering
buf_arg = 0
if sys.version_info[0] == 3:
    os.environ['PYTHONUNBUFFERED'] = '1'
    buf_arg = 1  # Python 3 text streams cannot be unbuffered; use line buffering
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', buf_arg)
sys.stderr = os.fdopen(sys.stderr.fileno(), 'w', buf_arg)

# EDIT FOR YOUR FILES - relative to current working dir
mybibfiles = ["path1/file1.bib", "path2/file2.bib"]

numcommas = 0

for bibfile in mybibfiles:
    print((bibfile, os.path.isfile(bibfile)))
    # create a fresh parser per file so that no parsed state is shared between files
    # homogenize_fields: Sanitize BibTeX field names, for example change `url` to `link` etc.
    tbparser = BibTexParser()
    tbparser.homogenize_fields = False  # no dice
    tbparser.alt_dict['url'] = 'url'  # this finally prevents change 'url' to 'link'
    with open(bibfile) as bibtex_file:
        bibtex_str = bibtex_file.read()
    bib_database = bibtexparser.loads(bibtex_str, tbparser)
    pprint.pprint(bib_database.entries)  # already here, would by default replace 'url' with 'link'!
    bibdblen = len(bib_database.entries)
    for icpbe, paperbibentry in enumerate(bib_database.entries):
        # entries without an author field (e.g. some @misc) are reported as OK
        authstr = paperbibentry.get('author', '')
        if "," in authstr:
            numcommas += 1
            report = "%d/%d: Comma present: '%s'" % (icpbe + 1, bibdblen, authstr)
            authstrauthors = authstr.split(" and ")
            for ia, author in enumerate(authstrauthors):
                authorparts = [part.strip() for part in author.split(",")]
                if len(authorparts) == 2:
                    # "Last, First": the first part is the last name, needs to become last
                    authstrauthors[ia] = authorparts[1] + " " + authorparts[0]
                # names with two commas ("Last, Jr, First") are left unchanged,
                # because the comma-free form cannot express the Jr part
            paperbibentry['author'] = " and ".join(authstrauthors)
            bib_database.entries[icpbe] = paperbibentry
            report += " -> '%s'" % (paperbibentry['author'])
        else:
            report = "%d/%d: OK" % (icpbe + 1, bibdblen)
        if sys.version_info[0] == 3:
            print(report)
        else:  # python 2
            print(report.encode('utf-8'))
    # WARNING: this overwrites the original .bib file in place
    with open(bibfile, 'w') as thebibfile:
        bibtex_str = bibtexparser.dumps(bib_database)
        if sys.version_info[0] < 3:  # python 2
            thebibfile.write(bibtex_str.encode('utf8'))
        else:  # python 3
            thebibfile.write(bibtex_str)

print("\nFound & converted total of %d author fields in format Last, First (with commas)." % (numcommas))
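The name swap itself can also be written as a small pure function, which makes it easy to try on a few names before touching any file. The following is only a sketch under stated assumptions, not part of the script above: the helper names split_top_level, to_first_last and normalize_author_field are made up for illustration, authors are assumed to be separated by a lowercase " and ", and braces are assumed to be balanced. Names with two commas ("von Last, Jr, First") are left unchanged, because BibTeX has no comma-free way to write the Jr part.

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring any occurrence inside {braces}."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif depth == 0 and text.startswith(sep, i):
            parts.append(text[start:i])
            i += len(sep)
            start = i
            continue
        i += 1
    parts.append(text[start:])
    return parts


def to_first_last(name):
    """Turn 'von Last, First' into 'First von Last'; leave other forms alone."""
    parts = [p.strip() for p in split_top_level(name, ",")]
    if len(parts) == 2 and all(parts):
        last, first = parts
        return first + " " + last
    # No comma (already 'First Last'), a brace-protected name, or a
    # two-comma 'von Last, Jr, First' form: return it unchanged.
    return name.strip()


def normalize_author_field(field):
    """Apply to_first_last to every author in a BibTeX author field."""
    # Assumption: authors are separated by a lowercase ' and '.
    authors = split_top_level(field, " and ")
    return " and ".join(to_first_last(a) for a in authors)


if __name__ == "__main__":
    print(normalize_author_field("Brown, Noam and Sandholm, Tuomas"))
```

In a script like the one above, this would replace the inner loop with a single assignment such as paperbibentry['author'] = normalize_author_field(paperbibentry['author']).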





edited Aug 24 '17 at 7:58

answered Aug 23 '17 at 16:23

sdaau























In trying to solve the same problem almost a year later, I found out that JabRef has an option called "Cleanup entries" under the "Quality" menu. If one adds the rule "Normalize names of persons" for the "author" and/or "editor" fields, JabRef normalizes names to the "von Last, Jr., First" format. This is not exactly what the original question asks for, but since it homogenizes the way all name fields are represented in a .bib file, I think it's worth mentioning.

JabRef can also point out which entries are out of spec via the "Quality" > "Check integrity" option.

It doesn't work with the BibTeX extended name format (see section 3.8 of the biber manual for what that is).
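For readers who want to see what a normalization toward "von Last, First" involves, here is a rough Python sketch. It is not JabRef's implementation. The function names to_last_first and normalize_to_last_first are made up for illustration, and the heuristic is a simplification of BibTeX's name rules: words starting with a lowercase letter mark the start of the "von" part, and everything from there on is treated as "von Last". It also assumes that authors are separated by a lowercase " and " and that no " and " occurs inside braces.

```python
def to_last_first(name):
    """Turn 'First von Last' into 'von Last, First' (simplified heuristic)."""
    name = name.strip()
    # Leave comma forms, brace-protected names and BibTeX's 'others' alone.
    if "," in name or name.startswith("{") or name == "others":
        return name
    tokens = name.split()
    if len(tokens) < 2:
        return name
    # Indices of lowercase-initial words, not counting the final word.
    lower = [i for i, tok in enumerate(tokens[:-1]) if tok[0].islower()]
    if lower:
        first, von_last = tokens[:lower[0]], tokens[lower[0]:]
    else:
        first, von_last = tokens[:-1], tokens[-1:]
    if not first:
        return " ".join(von_last)
    return "%s, %s" % (" ".join(von_last), " ".join(first))


def normalize_to_last_first(field):
    """Apply to_last_first to each author of a BibTeX author field."""
    # Assumption: no ' and ' appears inside a brace-protected name.
    return " and ".join(to_last_first(a) for a in field.split(" and "))


if __name__ == "__main__":
    print(normalize_to_last_first("Noam Brown and Tuomas Sandholm"))
```

Whatever tool does the conversion, spot-checking names with particles, suffixes or corporate authors afterwards is a good idea, since those are the cases where simple rules tend to go wrong.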
















                  answered Dec 28 '18 at 16:20









                  Douglas De Rizzo Meneghetti






























