Converting a YouTube embed link to a regular link in Python

Question

I am fairly new to Python; C# is what I usually code in.

I have the following:

https://www.youtube.com/embed/FjHGZj2IjBk?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0

And I want to convert it to:

https://www.youtube.com/watch?v=FjHGZj2IjBk

I wrote a basic function, and it works, but is it done in the best, most readable way?

    embed_links = [
        "https://www.youtube.com/embed/NL5ZuWmrisA?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0",
        "https://www.youtube.com/embed/gnZImHvA0ME?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0",
        "https://www.youtube.com/embed/FjHGZj2IjBk?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0"
    ]


    def convert_to_regular_youtube_link(embed_link):
        embed_link_split = embed_link.split('/')
        video_id_with_params = embed_link_split[4]
        head, sep, tail = video_id_with_params.partition('?')
        video_id = head
        final_youtube_link = f'https://www.youtube.com/watch?v={video_id}'
        return final_youtube_link


    for link in embed_links:
        print(convert_to_regular_youtube_link(link))

@rak1507 while that is in some regards the best way to do it, OP also half-asked for a readable way, and that immediately disqualifies any regex-based approach - QuestionablePresence, Commented Jan 10, 2022 at 13:42
@Hobbamok Regex is great at parsing simple grammars. I'm always a fan when parser libraries support regex because then the grammars can be more readable. A blanket statement like "regex is unreadable" is just untrue. Even looking at the specifics of the top answer, the regex solution isn't unreadable. - Peilonrayz, Commented Jan 10, 2022 at 16:43

Richard Neumann · Accepted Answer · 2022-01-10 12:35:05Z

The functional approach to your problem is a good start. But splitting strings, more specifically URLs, and accessing their parts by index can be error-prone.
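To make the fragility concrete, here is a small sketch of the index-based extraction from the question applied to inputs it was not written for. The helper name `video_id_by_index` and the two extra URLs are made up for illustration.

```python
def video_id_by_index(embed_link: str) -> str:
    """Index-based extraction as in the question (hypothetical helper name)."""
    return embed_link.split('/')[4].partition('?')[0]


# Works for the exact format the question assumes.
print(video_id_by_index("https://www.youtube.com/embed/FjHGZj2IjBk?controls=2"))

# Without a scheme there are fewer '/'-separated parts, so index 4 does not exist.
try:
    video_id_by_index("www.youtube.com/embed/FjHGZj2IjBk?controls=2")
except IndexError as error:
    print(f"IndexError: {error}")

# A URL that is not an embed link at all (made-up path) silently yields a bogus "ID".
print(video_id_by_index("https://www.youtube.com/some/other-page"))
```

The first failure is loud, the second is silent, and the silent one is the kind that tends to go unnoticed.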

Use standard library tools to parse URLs

The standard library modules urllib.parse and pathlib provide functions to parse and manipulate URLs and paths.
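As a rough illustration of what these modules give you, the sketch below takes a shortened version of the embed URL from the question apart. Using `PurePosixPath(...).name` for the last path segment is just one option chosen for this example.

```python
from pathlib import PurePosixPath
from urllib.parse import parse_qs, urlparse

# Shortened version of the embed URL from the question.
url = "https://www.youtube.com/embed/FjHGZj2IjBk?controls=2&fs=0"

parts = urlparse(url)
print(parts.scheme)  # https
print(parts.netloc)  # www.youtube.com
print(parts.path)    # /embed/FjHGZj2IjBk
print(parts.query)   # controls=2&fs=0

# The last path segment is the video ID.
video_id = PurePosixPath(parts.path).name
print(video_id)      # FjHGZj2IjBk

# parse_qs turns the query string into a dict mapping names to lists of values.
print(parse_qs(parts.query))  # {'controls': ['2'], 'fs': ['0']}
```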

Naming

The parameter name embed_link could be improved. It is really a URL, so why not just call it that? Also consider capitalizing global constants.

Use `if __name__ == '__main__'` guard

... so that your script does not run if one only wants to import the function.

Use docstrings

... to document your functions. Briefly explain what they are doing.

Type hinting

You may (or may not) use type hints to clarify to the user what types of parameters your function expects and what it returns.

Suggested change:

    from pathlib import Path
    from urllib.parse import urlparse, urlunparse


    EMBED_URLS = [
        "https://www.youtube.com/embed/NL5ZuWmrisA?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0",
        "https://www.youtube.com/embed/gnZImHvA0ME?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0",
        "https://www.youtube.com/embed/FjHGZj2IjBk?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0"
    ]


    def convert_to_watch_url(embed_url: str) -> str:
        """Convert a YouTube embed URL into a watch URL."""

        scheme, netloc, path, params, query, fragment = urlparse(embed_url)
        video_id, path = Path(path).stem, '/watch'
        return urlunparse((scheme, netloc, path, params, f'v={video_id}', fragment))


    def main() -> None:
        """Convert and print the test URLs."""

        for embed_url in EMBED_URLS:
            print(convert_to_watch_url(embed_url))


    if __name__ == '__main__':
        main()

Thank you @RichardNeumann, that's some great feedback! Much appreciated. - J86, Commented Jan 9, 2022 at 18:08
You're welcome, but maybe wait for more reviews before you accept any answer. - Richard Neumann, Commented Jan 9, 2022 at 18:09
If you add a type hint of -> None then static analysis tools like mypy will typecheck it. - Jasmijn, Commented Jan 10, 2022 at 9:20

ades · Accepted Answer · 2022-01-09 18:53:23Z

To me, the most readable approach would be pattern matching, i.e. using a regex, together with a docstring. Dealing with splits and indices is always a bother to read (if you keep them, tuple unpacking into named variables would be better), and you don't need any real flexibility here anyway, so it is a very simple operation. You don't really need anything other than:

    import re


    def convert_to_watch_url(url: str) -> str:
        """Converts youtube embed-url to watch-url

        >>> convert_to_watch_url("https://www.youtube.com/embed/FjHGZj2IjBk?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0")
        'https://www.youtube.com/watch?v=FjHGZj2IjBk'
        """
        # '-' and '_' are part of the character class because video IDs may contain them.
        pattern = r"^.*/embed/([a-zA-Z0-9_-]*)\?controls.*$"
        repl = r"https://www.youtube.com/watch?v=\1"
        return re.sub(pattern, repl, url)

Then I'd probably turn it into a script that accepts arguments:

    #!/usr/bin/env python3
    """Small script to convert YouTube embed URLs to watch URLs.

    Can be used from a shell like

        $ ./converter.py -h
        $ python3 converter.py "https://www.youtube.com/embed/FjHGZj2IjBk?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0"
        $ cat urls.txt | xargs python3 converter.py

    or from Python

        >>> from converter import convert_to_watch_url
    """
    import argparse
    import re


    def convert_to_watch_url(url: str) -> str:
        """Converts youtube embed-url to watch-url

        >>> convert_to_watch_url("https://www.youtube.com/embed/FjHGZj2IjBk?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0")
        'https://www.youtube.com/watch?v=FjHGZj2IjBk'
        """
        # '-' and '_' are part of the character class because video IDs may contain them.
        pattern = r"^.*/embed/([a-zA-Z0-9_-]*)\?controls.*$"
        repl = r"https://www.youtube.com/watch?v=\1"
        return re.sub(pattern, repl, url)


    def main():
        parser = argparse.ArgumentParser(description="Small script to convert URLs")
        # Positional arguments cannot take dest=; the name is the destination,
        # and metavar only controls how it is shown in the help text.
        parser.add_argument("urls", metavar="URL", nargs="*", help="URL(s) to convert")
        args = parser.parse_args()
        for url in args.urls:
            print(convert_to_watch_url(url))


    if __name__ == "__main__":
        main()

And if you want to keep your tests (though my example is already testable with doctest), I'd use pytest and keep the tests separate:

    import pytest

    # Assumes the script above is saved as converter.py.
    from converter import convert_to_watch_url


    @pytest.mark.parametrize("embed, watch", [
        ("https://www.youtube.com/embed/NL5ZuWmrisA?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0",
         "https://www.youtube.com/watch?v=NL5ZuWmrisA"),
        ("https://www.youtube.com/embed/gnZImHvA0ME?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0",
         "https://www.youtube.com/watch?v=gnZImHvA0ME"),
        ("https://www.youtube.com/embed/FjHGZj2IjBk?controls=2&fs=0&rel=0&modestbranding=1&showinfo=0&autohide=1&iv_load_policy=3&cc_load_policy=0&autoplay=0",
         "https://www.youtube.com/watch?v=FjHGZj2IjBk"),
    ])
    def test_convert_watch_url_on_correct_urls(embed, watch):
        assert convert_to_watch_url(embed) == watch

(which either needs to import the function or be in the same file).
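One thing worth knowing about the `re.sub` version: when the pattern does not match, `re.sub` returns the input unchanged, so a URL that is not an embed link passes through silently. If you would rather fail loudly, a stricter variant could look like the sketch below. The name `convert_to_watch_url_strict`, the pattern, and the assumption that video IDs consist of letters, digits, `-` and `_` are mine, not part of the code above.

```python
import re

# Assumption: video IDs consist of letters, digits, '-' and '_'.
EMBED_PATTERN = re.compile(
    r"^https?://[^/]+/embed/(?P<video_id>[A-Za-z0-9_-]+)(?:[/?#].*)?$"
)


def convert_to_watch_url_strict(url: str) -> str:
    """Convert an embed URL to a watch URL, or raise ValueError if it is not one."""
    match = EMBED_PATTERN.match(url)
    if match is None:
        raise ValueError(f"not an embed URL: {url!r}")
    return f"https://www.youtube.com/watch?v={match['video_id']}"


print(convert_to_watch_url_strict("https://www.youtube.com/embed/FjHGZj2IjBk?controls=2"))

try:
    convert_to_watch_url_strict("https://www.youtube.com/watch?v=FjHGZj2IjBk")
except ValueError as error:
    print(error)
```

Whether raising is preferable to passing the input through depends on where the URLs come from; for a one-off conversion of known-good links, the shorter version is probably enough.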
