When someone merges a pull request into a private repository on GitHub, I want to show the details of the pull request, including the images in its description, in another location (Slack). Usually these are short videos or screenshots of what has changed, so it would be great to have a continuous stream of changes visible to everyone in Slack.
From what I can tell looking at the GitHub API Docs, there is no way to download these images via the API.
The images are stored at URLs like https://github.com/owner/project-name/assets/* that are not publicly accessible, so you have to be logged in via the browser to actually get access to the image.
When you do view an image in the browser, GitHub redirects you to a short-lived URL that looks like https://private-user-images.githubusercontent.com/123456/251885706-e74af325-a947-47f7-8dad-61129ad62f11.png?jwt=eyJ... This URL is public, but I want to generate it without being logged into the browser, so that I can do this in response to a webhook.
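As an aside, the `jwt` query parameter on those redirect URLs appears to be a standard JSON Web Token, so you can decode its payload (without verifying the signature) to inspect the expiry claim, if one is present. A minimal sketch, using a made-up token rather than a real GitHub one:

```python
import base64
import json


def jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding that JWTs strip off
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


# Hypothetical token: header {"alg":"HS256"}, payload {"exp": 1700000000}
header = base64.urlsafe_b64encode(b'{"alg":"HS256"}').decode().rstrip("=")
payload = base64.urlsafe_b64encode(b'{"exp": 1700000000}').decode().rstrip("=")
token = f"{header}.{payload}.sig"

print(jwt_payload(token)["exp"])  # → 1700000000
```

This only tells you when a link expires; it does not help mint a new one, which is the actual problem here.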
For example, the PR description might have something like this:

```
Did a bunch of cool stuff in this one...

## What it looks like

<img width="1238" alt="Screenshot 2023-07-07 at 6 28 14 PM"
src="https://github.com/owner/project-name/assets/123456/e74af324-a944-47f4-8da4-61129ad62f14">
```
What I want to know is how to download the image located at https://github.com/owner/project-name/assets/123456/e74af324-a944-47f4-8da4-61129ad62f14 remotely with a script.
Get your `user_session` cookie from your browser and provision a token to access the GitHub API.
```sh
export GH_TOKEN="<token>"
export GH_SESSION_COOKIE="<session_cookie>"
python download.py "<owner>/<repo>/pulls/<pr_number>"
```
```python
#!/usr/bin/env python3
import json
import os
import re
import sys
import urllib.request
from urllib.parse import urlparse


def main():
    # Read GH_TOKEN and GH_SESSION_COOKIE from environment variables
    gh_token = os.environ["GH_TOKEN"]
    gh_session_cookie = os.environ["GH_SESSION_COOKIE"]

    # "<owner>/<repo>/pulls/<pr_number>" from the command line
    path_segment = sys.argv[1]

    # Pattern matching URL-like strings (stops at the closing quote)
    url_regexp = re.compile(r"https?://[^\"]+")

    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {gh_token}",
        "X-GitHub-Api-Version": "2022-11-28",
    }

    # Download the pull request body via the REST API
    req = urllib.request.Request(
        f"https://api.github.com/repos/{path_segment}", headers=headers)
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))["body"]

    # Get all occurrences of URL-like patterns using the regexp
    urls = url_regexp.findall(body)

    # Download each file, authenticating with the session cookie
    for url in urls:
        cookie_headers = {
            "Cookie": f"user_session={gh_session_cookie};",
        }
        req = urllib.request.Request(url, headers=cookie_headers)
        with urllib.request.urlopen(req) as u:
            # Take the file name from the final (redirected) URL
            filename = urlparse(u.geturl()).path.split("/")[-1]
            with open(filename, "wb") as f:
                f.write(u.read())


if __name__ == "__main__":
    main()
```
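Note that the `https?://[^\"]+` pattern matches every URL in the body, not just image attachments, so unrelated links would be downloaded too. If you only want the attachment links, a narrower pattern scoped to GitHub's asset path (an assumption about what you want to keep; adjust as needed) might look like:

```python
import re

# Only match github.com/<owner>/<repo>/assets/... attachment URLs
ASSET_URL_RE = re.compile(r'https://github\.com/[^/"]+/[^/"]+/assets/[^\s">]+')

# Sample PR body mixing an attachment with an unrelated link
body = (
    'Did a bunch of cool stuff...\n'
    '<img src="https://github.com/owner/project-name/assets/123456/'
    'e74af324-a944-47f4-8da4-61129ad62f14">\n'
    'See also https://example.com/docs'
)

urls = ASSET_URL_RE.findall(body)
print(urls)  # only the attachment URL; the docs link is skipped
```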
The `user_session` cookie is only valid for 2 weeks. Take extra care to keep it secret, as that cookie allows anyone holding it to impersonate your GitHub account.
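Since attachments in PR bodies typically arrive as `<img>` tags (as in the example earlier), an alternative to the URL regex is to pull the `src` attributes out with the standard-library HTML parser, which avoids matching plain-text links. A minimal sketch:

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag encountered."""

    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.srcs.append(value)


parser = ImgSrcCollector()
parser.feed(
    '<img width="1238" alt="Screenshot" '
    'src="https://github.com/owner/project-name/assets/123456/'
    'e74af324-a944-47f4-8da4-61129ad62f14">'
)
print(parser.srcs)  # list with the single src URL
```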