If you are wondering how the whole workflow works, there will be an upcoming note HEY-Screener in (Neo)Mutt; for now, you can check all scripts in my Mutt dotfiles.
# Grabbing emails from ScreenedIn and Out from the current screener page
This is the HEY Screener URL we want to scrape: https://app.hey.com/my/clearances?page=3. The tags to grab are screened-person--denied and screened-person--approved.
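As a minimal sketch of what those selectors match, here is a made-up HTML fragment that mirrors the class names (the real Screener markup may differ):

```python
from bs4 import BeautifulSoup

# Hypothetical fragment mimicking the Screener page structure.
html = """
<div class="screened-person screened-person--approved">
  <div class="screened-person__details"><span>friend@example.com</span></div>
</div>
<div class="screened-person screened-person--denied">
  <div class="screened-person__details"><span>spam@example.com</span></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Same CSS selectors as the full script below.
approved = [e.select_one(".screened-person__details span").get_text(strip=True)
            for e in soup.select(".screened-person--approved")]
denied = [e.select_one(".screened-person__details span").get_text(strip=True)
          for e in soup.select(".screened-person--denied")]

print(approved)  # ['friend@example.com']
print(denied)    # ['spam@example.com']
```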
This is the second option; I created it after the Console one (below) didn't scale and only worked for a single page.

Then I tried to find an open API (see further below how you can find it). Once I found one, I used Python to loop through all pages via the "older" button, extracting the emails on each page:
```python
import requests
from bs4 import BeautifulSoup
import os


def scrape_emails(url, cookies):
    page = 1
    denied_emails = []
    approved_emails = []
    with requests.Session() as session:
        while True:
            response = session.get(url, params={"page": page}, cookies=cookies)
            soup = BeautifulSoup(response.text, "html.parser")

            # Extract emails
            for element in soup.select(".screened-person--denied"):
                email = element.select_one(".screened-person__details span")
                if email:
                    denied_emails.append(email.get_text(strip=True))

            for element in soup.select(".screened-person--approved"):
                email = element.select_one(".screened-person__details span")
                if email:
                    approved_emails.append(email.get_text(strip=True))

            # Check for the 'Older' button/link
            next_page_link = soup.select_one(
                'a.paginator__next[href*="/my/clearances?page="]'
            )
            if not next_page_link:
                break  # No more pages
            page += 1
            # if page == 3:
            #     break

    return denied_emails, approved_emails


def write_to_file(filename, email_list):
    with open(filename, "w") as file:
        for email in email_list:
            file.write(f"{email}\n")


cookies = {
    # Set ENV variable with hey cookie. Load the screener and search in the
    # network tab for the `https://app.hey.com/my/clearances?page=` request.
    # There you see the cookies used. Might need to change after re-login.
    "_csrf_token": os.getenv("HEY_COOKIE"),
}

url = "https://app.hey.com/my/clearances"
denied_emails, approved_emails = scrape_emails(url, cookies)

# Write the lists to files
write_to_file("denied_emails.txt", denied_emails)
write_to_file("approved_emails.txt", approved_emails)

print("Denied Emails:", denied_emails)
print("Approved Emails:", approved_emails)
```
See the latest version on GitHub.
# How to get Cookie
Make sure to set the ENV variable with the HEY cookie. You can get its value by loading the screener and searching the network tab for the https://app.hey.com/my/clearances?page= request. There you can see the cookies being sent; they might need to be updated after a re-login.
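A quick sketch of setting it up, assuming you have copied the `_csrf_token` value from the network tab (the placeholder value below is hypothetical):

```shell
# Paste the _csrf_token value copied from the browser's network tab.
export HEY_COOKIE='paste-cookie-value-here'

# Sanity check: the script reads it via os.getenv("HEY_COOKIE").
python3 -c 'import os; print("HEY_COOKIE set:", bool(os.getenv("HEY_COOKIE")))'
```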

# Manually per page: Console (JavaScript)
Use the Console in the browser's Developer Tools to extract all ScreenedIn/Out emails from the current page:
```javascript
const extractEmails = (className) => {
  const emails = [];
  document.querySelectorAll(`.${className}`).forEach(element => {
    const emailElement = element.querySelector('.screened-person__details');
    if (emailElement) {
      const emailText = emailElement.textContent.trim();
      const emailMatch = emailText.match(/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/);
      if (emailMatch) {
        emails.push(emailMatch[0]);
      }
    }
  });
  return emails;
};

const deniedEmails = extractEmails('screened-person--denied');
const approvedEmails = extractEmails('screened-person--approved');
console.log('Denied Emails:', deniedEmails);
console.log('Approved Emails:', approvedEmails);
```
Origin: HEY-Screener in (Neo)Mutt
References: Getting the Data – Scraping
Created 2023-11-21