@marcgg
Created December 8, 2010 17:26
Regex to get the Facebook Page ID from a given URL
# Matches patterns such as:
# https://www.facebook.com/my_page_id => my_page_id
# http://www.facebook.com/my_page_id => my_page_id
# http://www.facebook.com/#!/my_page_id => my_page_id
# http://www.facebook.com/pages/Paris-France/Vanity-Url/123456?v=app_555 => 123456
# http://www.facebook.com/pages/Vanity-Url/45678 => 45678
# http://www.facebook.com/#!/page_with_1_number => page_with_1_number
# http://www.facebook.com/bounce_page#!/pages/Vanity-Url/45678 => 45678
# http://www.facebook.com/bounce_page#!/my_page_id?v=app_166292090072334 => my_page_id
# http://www.facebook.com/my.page.is.great => my.page.is.great
/(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)/
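A minimal Python sketch for trying the pattern above against the listed URLs. The helper name `page_id` is hypothetical, and the pattern is copied as-is into a raw string; this is a way to check the examples, not a definitive implementation.

```python
import re

# The gist's pattern, copied verbatim into a Python raw string.
PAGE_ID = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)"
)


def page_id(url):
    """Return the captured page ID, or None if the URL does not match.

    `page_id` is a hypothetical helper name, not part of the gist.
    """
    match = PAGE_ID.search(url)
    return match.group(1) if match else None


for url in (
    "https://www.facebook.com/my_page_id",
    "http://www.facebook.com/#!/my_page_id",
    "http://www.facebook.com/pages/Paris-France/Vanity-Url/123456?v=app_555",
    "http://www.facebook.com/bounce_page#!/pages/Vanity-Url/45678",
    "http://www.facebook.com/my.page.is.great",
):
    print(url, "=>", page_id(url))
```

Note that with a trailing slash (`https://www.facebook.com/my_page_id/`) the capture group comes back empty, because the greedy `(?:[\w\-]*\/)*` consumes `my_page_id/` first.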
@michikono

Doesn't work for paths with a trailing slash. Try adding a check for that (added (\/)?).

/(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(\/)?([\w\-\.]*)/

To address the UTF-8 comment: \b only does ASCII. To work with UTF-8, you need to define your own word boundaries.

The best solution here is probably to use a negated character class ("anything that is not a slash or question mark") to find the usernames. This works in this situation since we know the only place special characters would appear is in the username.

@michikono

Here's my attempt at handling usernames in other languages.

/(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(\/)?([^/?]*)/

Tested it against:

http://www.facebook.com/Φc?ref=hl => [2] = Φc
http://www.facebook.com/Φc/?ref=hl => [2] = Φc
http://www.facebook.com/pages/Φc/1234?ref=hl => [2] = Φc
http://www.facebook.com/pages/Φc/1234/?ref=hl => [2] = Φc

This also includes the trailing-slash check from my previous comment.
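A small Python sketch of the negated-character-class idea, run against the four test URLs above. The helper name `username` is hypothetical; the pattern is the one from this comment, and the username lands in group 2 because group 1 is the optional slash.

```python
import re

# Pattern from the comment above: the final group accepts anything that is
# not a slash or a question mark, so non-ASCII usernames are captured too.
PATTERN = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*?(\/)?([^/?]*)"
)


def username(url):
    """Return group 2 (the username), or None when nothing matches.

    `username` is a hypothetical helper name used only for this sketch.
    """
    match = PATTERN.search(url)
    return match.group(2) if match else None


for url in (
    "http://www.facebook.com/Φc?ref=hl",
    "http://www.facebook.com/Φc/?ref=hl",
    "http://www.facebook.com/pages/Φc/1234?ref=hl",
    "http://www.facebook.com/pages/Φc/1234/?ref=hl",
):
    print(url, "=>", username(url))
```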

@campbell-codes

One thing you can do is use two separate regexes: try to find a numeric ID first, then fall back to the vanity ID on failure.
I've used this pretty basic regex to look for numbers of length 10 or greater:
/(\d{10,})/

Take the result if one is found; if not, use the original regex from this gist. Obviously this might fail if the name of the page is number1234567890, but that is a pretty special case.

I have found this to work pretty well for me, but criticism is welcome.

Example URL:
https://www.facebook.com/pages/GHOST-Caf%C3%A9/627191887397533?fref=ts
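The two-step approach described above could be sketched in Python roughly like this. The function name `extract_page_id` is hypothetical, and the fallback uses the original pattern from the top of the gist; treat it as an illustration under those assumptions.

```python
import re

# Step 1: any run of 10 or more digits is treated as a numeric page ID.
NUMERIC_ID = re.compile(r"(\d{10,})")

# Step 2 (fallback): the original pattern from the top of the gist.
VANITY_ID = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)"
)


def extract_page_id(url):
    """Try the numeric ID first, then fall back to the vanity ID.

    `extract_page_id` is a hypothetical name for this sketch.
    Returns None when neither pattern yields a non-empty result.
    """
    numeric = NUMERIC_ID.search(url)
    if numeric:
        return numeric.group(1)
    vanity = VANITY_ID.search(url)
    if vanity and vanity.group(1):
        return vanity.group(1)
    return None


print(extract_page_id(
    "https://www.facebook.com/pages/GHOST-Caf%C3%A9/627191887397533?fref=ts"
))
print(extract_page_id("https://www.facebook.com/my_page_id"))
```

One caveat beyond the one already mentioned: the numeric search looks at the whole URL, so a long number in the query string (as in `?v=app_166292090072334`) is picked up before the vanity name.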

@philippeluickx

Anyone who wants this for Python:

https?://(www\.)?facebook\.com/(\w*#!/)?(pages/)?(([\w-]*/)*)?(?P<page_id>[\w.-]+)
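A short usage sketch for a Python pattern of this shape, assuming `*` quantifiers and escaped dots in the host name (that reading of the intended pattern is an assumption). The helper name `python_page_id` is hypothetical.

```python
import re

# Assumed reading of the Python pattern: `*` quantifiers, escaped dots,
# and a named group `page_id` for the result.
PATTERN = re.compile(
    r"https?://(www\.)?facebook\.com/(\w*#!/)?(pages/)?"
    r"(([\w-]*/)*)?(?P<page_id>[\w.-]+)"
)


def python_page_id(url):
    """Return the named group `page_id`, or None if there is no match."""
    match = PATTERN.match(url)
    return match.group("page_id") if match else None


print(python_page_id("http://www.facebook.com/pages/Vanity-Url/45678"))
print(python_page_id("https://www.facebook.com/my.page.is.great"))
print(python_page_id("https://www.facebook.com/my_page_id/"))
```

Because the named group uses `+` rather than `*`, it cannot match the empty string, so the regex backtracks and still captures the ID when the URL ends in a slash.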

@nkanaev

nkanaev commented May 22, 2017

Matching fails if the URL ends with a slash, like https://www.facebook.com/my_page_id/
The regex below is a simpler working solution:

(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:.+\/)*([\w\.\-]+)

@Raphhh

Raphhh commented Sep 18, 2017

@lekiend

lekiend commented Dec 6, 2017

@BastienMottier
Just a little mistake: there is an unescaped slash at the end.
The one below works better:
^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(\/)?([^/?\s]*)(?:\/|&|\?)?.*$

@msdinit

msdinit commented Feb 28, 2018

@Raphhh
this one below also works for profile.php
^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^/?\s]*)(?:\/|&|\?)?.*$

@arielperez82

@lekiend There was one more missing slash.

Also, the one below rejects bare host URLs that have no profile after them (with or without a trailing slash), e.g. https://www.facebook.com, http://fb.me, https://m.facebook.com/

^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?\s]*)(?:\/|&|\?)?.*$

@musasoftlabx

Where did you guys learn how to write all these?

@ttodua

ttodua commented Sep 16, 2018

Doesn't work for pages whose names contain Unicode characters, like this:

https://www.facebook.com/საწარმო-SabaDesign-927047470710565/?ref=safrghbeდფწერგ

@hoofdletterj

Props to all contributors!!
Everything incorporated above, with just one more forgotten escape character added, gives me this:

/^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?\s]*)(?:\/|&|\?)?.*$/

Which works GREAT (yay), except when the url has arguments after the profile.php?id= or fbid= part like these urls:

https://www.facebook.com/profile.php?id=114376375296751&fref=pb&hc_location=friends_tab
returns 114376375296751&fref=pb&hc_location=friends_tab instead of 114376375296751

and

https://www.facebook.com/photo.php?fbid=114376375296751&set=a.114376371963418.13845.114375165296872&type=1&theater
returns 114376375296751&set=a.114376371963418.13845.114375165296872&type=1&theater

Someone care to snip everything off after the first &?

@ayal

ayal commented Jul 31, 2019

/^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?\&\s]*)(?:\/|&|\?)?.*?$/

this should exclude the & as well

@fabriciopirini

Ayal's alternative didn't work for me. It worked when I took hoofdletterj's answer and added & before \s (Ayal's partial answer):

/^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*$/
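A quick Python check of the pattern above against the two problem URLs from earlier in the thread. The helper name `profile_or_page_id` is hypothetical; the pattern is copied without the surrounding `/` delimiters, which Python does not use.

```python
import re

# The pattern above, minus the JavaScript/Ruby-style `/.../` delimiters.
# The `&` inside the negated class stops the capture at the first `&`.
PATTERN = re.compile(
    r"^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?"
    r"(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?"
    r"(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*$"
)


def profile_or_page_id(url):
    """Return the captured ID, or None for URLs the pattern rejects.

    `profile_or_page_id` is a hypothetical name for this sketch.
    """
    match = PATTERN.match(url)
    return match.group(1) if match else None


print(profile_or_page_id(
    "https://www.facebook.com/profile.php?id=114376375296751"
    "&fref=pb&hc_location=friends_tab"
))
print(profile_or_page_id(
    "https://www.facebook.com/photo.php?fbid=114376375296751"
    "&set=a.114376371963418.13845.114375165296872&type=1&theater"
))
print(profile_or_page_id("https://www.facebook.com/"))
```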

@WaqasAli853

I want to display the images from Facebook posts on my website by just copying the image address. Is there a regex for that? I tried the regex above, but it doesn't help.
preg_match_all('/(https?:\/\/\S+\.(?:jpg|png|gif))+/', $string, $match);
I am using this regex, but it finds all the other images except Facebook's.

@alberto98fx

I updated the regex to match even mbasic:

^(?:https?:\/\/)?(?:www\.|m\.|touch\.|mbasic\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*$

It works for urls like:

https://mbasic.facebook.com/BMW/?refid=46&__xts__%5B0%5D=12.%7B%22unit_id_click_type%22%3A%22graph_search_results_item_tapped%22%2C%22click_type%22%3A%22result%22%2C%22module_id%22%3A2%2C%22result_id%22%3A22893372268%2C%22session_id%22%3A%22e4709b011e94ec8207a44ffedd1d2901%22%2C%22module_role%22%3A%22ENTITY_PAGES%22%2C%22unit_id%22%3A%22browse_rl%3Ab2718be4-bbd0-4764-9c31-6908c431daa2%22%2C%22browse_result_type%22%3A%22browse_type_page%22%2C%22unit_id_result_id%22%3A22893372268%2C%22module_result_position%22%3A0%7D

@alberto98fx

There's also mobile.facebook.com, so here's the new regex:

^(?:https?:\/\/)?(?:www\.|m\.|mobile\.|touch\.|mbasic\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*$

Which can match stuff like:

https://mobile.facebook.com/BMW/

@xtvipxtt

@beshoo

beshoo commented Apr 11, 2021

(?:https?:\/\/)?(?:www\.|m\.|mobile\.|touch\.|mbasic\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/|pg\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*

This will match /pg/ URLs as well:
https://m.facebook.com/pg/DwayneTheRockJohnsonFanClub/photos/
...................................................^
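For anyone wanting to try this latest version from Python, here is a rough sketch that runs it against URLs mentioned in the thread. The helper name `facebook_id` is hypothetical, and the shortened mbasic URL is an abbreviation of the long example above.

```python
import re

# The latest pattern in the thread (unanchored), with the extra
# mobile./mbasic. hosts and the pg/ prefix.
PATTERN = re.compile(
    r"(?:https?:\/\/)?(?:www\.|m\.|mobile\.|touch\.|mbasic\.)?"
    r"(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?"
    r"(?:pages\/|pg\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?"
    r"(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*"
)


def facebook_id(url):
    """Return the captured page or profile ID, or None if nothing matches.

    `facebook_id` is a hypothetical helper name for this sketch.
    """
    match = PATTERN.search(url)
    return match.group(1) if match else None


for url in (
    "https://m.facebook.com/pg/DwayneTheRockJohnsonFanClub/photos/",
    "https://mobile.facebook.com/BMW/",
    "https://mbasic.facebook.com/BMW/?refid=46",
    "https://www.facebook.com/",
):
    print(url, "=>", facebook_id(url))
```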

@Hlaing567

@mohmmdmayyas
