Created
July 13, 2023 08:48
Matching publication numbers / ids from 3rd parties to EPO Patstat (USPTO / EPO)
def clean_uspto_publn_nr_to_patstat_format(dirty: str) -> str:
    """
    Removes the 0 at index 4 (directly after the four leading digits) from USPTO ids.
    "2010015376" --> "201015376"
    "20100015376" --> "2010015376"
    """
    # Ids shorter than 9 characters are returned unchanged.
    if len(dirty) < 9:
        return dirty
    # Ids starting with "3" are not supported.
    if dirty[0] == "3":
        raise ValueError("Cannot process id")
    # Drop the 0 that directly follows the first four digits, if present.
    if dirty[4] == "0":
        return dirty[0:4] + dirty[5:]
    else:
        return dirty
def clean_epo_publn_nr_to_patstat_format(dirty: str) -> str:
    """
    Left-pads EPO ids with 0s to a width of 7 characters.
    "1235" --> "0001235"
    """
    return dirty.zfill(7)
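A minimal usage sketch, assuming the cleaned numbers are compared against (authority, number) pairs taken from PATSTAT. The helper `to_patstat_key`, the short cleaner names, and the sample ids are hypothetical and for illustration only; the two cleaners are restated in compact form so the block runs on its own.

```python
def clean_uspto(dirty: str) -> str:
    # Same rule as clean_uspto_publn_nr_to_patstat_format:
    # drop the 0 at index 4 for ids of 9 or more characters.
    if len(dirty) < 9:
        return dirty
    if dirty[0] == "3":
        raise ValueError("Cannot process id")
    if dirty[4] == "0":
        return dirty[:4] + dirty[5:]
    return dirty


def clean_epo(dirty: str) -> str:
    # Same rule as clean_epo_publn_nr_to_patstat_format: left-pad to 7 characters.
    return dirty.zfill(7)


def to_patstat_key(authority: str, publn_nr: str) -> tuple:
    """Hypothetical helper: build an (authority, number) lookup key."""
    cleaners = {"US": clean_uspto, "EP": clean_epo}
    cleaner = cleaners.get(authority)
    if cleaner is None:
        raise ValueError(f"No cleaner for authority {authority!r}")
    return authority, cleaner(publn_nr.strip())


# Made-up sample data standing in for keys exported from PATSTAT.
patstat_keys = {("US", "2010015376"), ("EP", "0001235")}

# Third-party records as (authority, raw number) pairs.
third_party = [("US", "20100015376"), ("EP", "1235"), ("US", "7654321")]

# Keep only the records whose cleaned key exists on the PATSTAT side.
matched = [rec for rec in third_party if to_patstat_key(*rec) in patstat_keys]
print(matched)
```

Dispatching on the authority code keeps each office's rule in one place, so a further office could be added by registering another cleaner in the dictionary.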