Created
July 12, 2025 08:24
Find abbreviations
# Find abbreviations and their positions (start and end indexes) in the text
import re

text = "США и ООН обсуждали события на G20 и 2FA. Также упоминался NASA."

# Regular expression for abbreviations: a whole word of two or more
# uppercase Latin/Cyrillic letters or digits.
# Note: this also matches all-digit tokens such as "2025".
pattern = r'\b[A-ZА-ЯЁ0-9]{2,}\b'

# Use re.finditer to get match objects with position info
matches = re.finditer(pattern, text)

# Collect (matched abbreviation, start_index, end_index); end_index is exclusive
results = [(match.group(), match.start(), match.end()) for match in matches]

# Print results
for abbr, start, end in results:
    print(f"Found '{abbr}' at index {start}-{end}")

# Output
# Found 'США' at index 0-3
# Found 'ООН' at index 6-9
# Found 'G20' at index 31-34
# Found '2FA' at index 37-40
# Found 'NASA' at index 59-63
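
The pattern above accepts any run of uppercase letters and digits, so a plain number such as "2025" would be reported as an abbreviation. Below is a minimal sketch of one possible way to avoid that, assuming an abbreviation should contain at least one uppercase letter. The function name `find_abbreviations` and the lookahead are illustrative additions, not part of the original snippet.

```python
import re

# Hypothetical variant of the pattern: the lookahead (?=[0-9]*[A-ZА-ЯЁ])
# requires an uppercase letter after any leading digits, so all-digit
# tokens such as "2025" are skipped.
ABBR_PATTERN = re.compile(r'\b(?=[0-9]*[A-ZА-ЯЁ])[A-ZА-ЯЁ0-9]{2,}\b')


def find_abbreviations(text):
    """Return a list of (abbreviation, start, end) tuples; end is exclusive."""
    return [(m.group(), m.start(), m.end()) for m in ABBR_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "In 2025 NASA met G20"
    for abbr, start, end in find_abbreviations(sample):
        print(f"Found '{abbr}' at index {start}-{end}")
    # Found 'NASA' at index 8-12
    # Found 'G20' at index 17-20
```

Because the end index is exclusive, `text[start:end]` returns the matched abbreviation directly.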