Scraping 3rd-Party Ticket Prices Using Stubhub's API
I have a colleague working in entertainment who needed to gather ticket prices on third-party sales to get a reading on the popularity and pricing of events. Currently, the colleague’s company pays a large sum to a contractor to scrape this data daily. We wanted to see if this was possible and easy to do using the StubHub API.
Update 3/5/2017: I'm glad my blog post has been helpful for a lot of users who are interested in getting data from StubHub's API using Python. But since the blog post last June, there have been a lot of changes, and I wanted to update my post with the latest details.
First, StubHub deprecated version 1 of the Inventory Search API in favor of a newer version, so the code is updated to reflect that.
If you're having trouble subscribing to InventorySearchAPI-v2, see the last section on this post to manually subscribe to it.
Initial Problem
The output product from the contractor’s tool is a simple flat file with event, venue information, and all the tickets currently listed for sale with seat information, quantity, and prices.
This sounds pretty straightforward, and I should be able to use StubHub’s API to gather this information. So I started by doing some homework.
You can find my full code on GitHub
Getting Started with StubHub’s API
StubHub provides a robust set of APIs to access its site, with pretty thorough documentation, including a Getting Started Guide that covers signing up for a StubHub account and requesting API keys.
So I spent a few hours reading up on the developer interface to create a proof of concept for my colleague.
Step 1 - Obtaining StubHub User Access Token
The first step in using the API is to request an authorization token that my Python app will use. StubHub has some instructions using a REST client, but it’s a little different with Python.
Requesting an authorization token requires Base64-encoding the Consumer Key and Consumer Secret. First I enter my StubHub user account and my API info:
import requests
import base64
## Enter user's API key, secret, and Stubhub login
app_token = input('Enter app token: ')
consumer_key = input('Enter consumer key: ')
consumer_secret = input('Enter consumer secret: ')
stubhub_username = input('Enter Stubhub username (email): ')
stubhub_password = input('Enter Stubhub password: ')
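Then I concatenate the key and secret with a colon as per the instructions, and create the basic authorization token by encoding it in Base64. Since b64encode returns bytes in Python 3, I decode the result back to a string so it can be joined into the header.
combo = consumer_key + ':' + consumer_secret
basic_authorization_token = base64.b64encode(combo.encode('utf-8')).decode('utf-8')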
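The two lines above can be wrapped in a small helper. This is a minimal sketch: the name build_basic_token is my own and not part of StubHub's API, and the credentials are placeholders. Base64 is an encoding rather than encryption, so the result should still be treated as a secret.

```python
import base64


def build_basic_token(consumer_key, consumer_secret):
    """Return the Base64 form of 'key:secret' as a str (hypothetical helper)."""
    combo = consumer_key + ':' + consumer_secret
    # b64encode works on bytes and returns bytes, so encode first and decode after
    return base64.b64encode(combo.encode('utf-8')).decode('utf-8')


# Placeholder credentials, for illustration only
example_token = build_basic_token('key', 'secret')
example_header = 'Basic ' + example_token
```

Decoding example_token with base64.b64decode gives back 'key:secret', which is a quick way to confirm the value was built correctly.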
Now I create a POST request with the appropriate headers and use requests to talk to StubHub. I store my response in token_response and retrieve two fields in particular: the access_token is what I’m after, and my user_GUID will be handy for some API calls.
url = 'https://api.stubhub.com/login'
headers = {
    'Content-Type': 'application/x-www-form-urlencoded',
    'Authorization': 'Basic ' + basic_authorization_token,
}
body = {
    'grant_type': 'password',
    'username': stubhub_username,
    'password': stubhub_password,
    'scope': 'PRODUCTION',
}
r = requests.post(url, headers=headers, data=body)
# Print the status and raw body to confirm the login worked
print(r)
print(r.text)
token_response = r.json()
access_token = token_response['access_token']
user_GUID = r.headers['X-StubHub-User-GUID']
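Before relying on the token, it helps to check that the login actually succeeded. The sketch below is only an illustration: parse_login is a hypothetical helper, and it takes the status code, decoded JSON body, and headers as plain values so it can be tried without a network call. The only field names it relies on are access_token and X-StubHub-User-GUID, both taken from the code above.

```python
def parse_login(status_code, payload, response_headers):
    """Pull the access token and user GUID out of a login response.

    Hypothetical helper: raises ValueError when the login did not succeed.
    """
    if status_code != 200 or 'access_token' not in payload:
        raise ValueError('Login failed with status %s: %s' % (status_code, payload))
    access_token = payload['access_token']
    # The GUID arrives in a response header; .get() returns None if it is absent
    user_guid = response_headers.get('X-StubHub-User-GUID')
    return access_token, user_guid


# Made-up values standing in for r.status_code, r.json(), and r.headers
token, guid = parse_login(
    200,
    {'access_token': 'abc123'},
    {'X-StubHub-User-GUID': 'GUID-0001'},
)
```

With a real response, this would be called as parse_login(r.status_code, r.json(), r.headers).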
Step 2 - Searching Inventory of an Event
To find the ticket inventory of an event, we’ll use the InventorySearch API. Of course we’ll need a specific Event ID.
There are two ways to get this. Let’s say my app tracks the prices of Hamilton tickets. On the event’s StubHub page, there’s a unique 7-digit number in the URL that’s the Event ID. Just copy that number:
The second way involves using the EventSearchAPI - v2, much like searching for an event on the website. I leave this to the reader to explore.
With the Event ID, it’s just a matter of making a GET request with the proper headers:
inventory_url = 'https://api.stubhub.com/search/inventory/v2'
eventid = '9670859'
data = {'eventid':eventid, 'rows':200}
headers['Authorization'] = 'Bearer ' + access_token
headers['Accept'] = 'application/json'
headers['Accept-Encoding'] = 'application/json'
inventory = requests.get(inventory_url, headers=headers, params=data)
One thing to note: this API returns 100 rows by default. If I want more, I can add rows as a parameter. See the API documentation for more details.
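If an event has more listings than a single request returns, the results can be fetched page by page. The sketch below assumes the API accepts a start offset alongside rows and that each response carries its tickets under the listing key; the start parameter is an assumption to check against the API documentation. The fetch_page argument is a hypothetical callable, so the loop can be exercised without a network connection.

```python
def fetch_all_listings(fetch_page, rows=200):
    """Collect listings page by page until a short or empty page comes back.

    fetch_page(start, rows) is a hypothetical callable returning a dict
    shaped like the API response, with tickets under the 'listing' key.
    """
    listings = []
    start = 0
    while True:
        page = fetch_page(start, rows).get('listing') or []
        listings.extend(page)
        if len(page) < rows:
            # Fewer rows than requested means there is nothing left to fetch
            break
        start += rows
    return listings


# Fake data source standing in for the real request: 450 made-up listings
fake_inventory = [{'listingId': i} for i in range(450)]


def fake_fetch(start, rows):
    return {'listing': fake_inventory[start:start + rows]}


all_listings = fetch_all_listings(fake_fetch, rows=200)
```

Against the real API, fetch_page could wrap requests.get(inventory_url, headers=headers, params={'eventid': eventid, 'start': start, 'rows': rows}).json(), assuming the start parameter behaves as described.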
inventory holds the JSON response. I’ll convert it to a dictionary with
inv = inventory.json()
In particular, I want to see the ticket listings, so I’ll look up the listing key:
import pprint
pprint.pprint(inv['listing'])
[{u'currentPrice': {u'amount': 663.3, u'currency': u'USD'},
u'deliveryMethodList': [2],
u'deliveryTypeList': [2],
u'dirtyTicketInd': False,
u'listingId': 1207961705,
u'listingPrice': {u'amount': 560.0, u'currency': u'USD'},
u'quantity': 2,
u'row': u'G',
u'seatNumbers': u'9,11',
u'sectionId': 659009,
u'sectionName': u'Mezzanine Rear Sides',
u'sellerOwnInd': 0,
u'sellerSectionName': u'Mezzanine Rear Sides',
u'splitOption': u'0',
u'splitVector': [2],
u'ticketSplit': u'2',
u'zoneId': 105098,
u'zoneName': u'Mezzanine Rear'},
...
Now I want to convert the list of listings to a pandas DataFrame. Since the currentPrice column is a nested dictionary with the ticket price and currency, I extract just the USD amount as a new column in my DataFrame:
import pandas as pd
listing_df = pd.DataFrame(inv['listing'])
listing_df['amount'] = listing_df.apply(
lambda x: x['currentPrice']['amount'], axis=1)
listing_df.to_csv('tickets_listing.csv', index=False)
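The same flattening can be done without pandas. The sketch below uses only the standard library to lift currentPrice['amount'] into its own column and produce CSV text. The sample rows are abbreviated stand-ins for inv['listing'], and flatten_listing is a hypothetical helper name.

```python
import csv
import io


def flatten_listing(listing):
    """Copy a listing dict and add a flat 'amount' column (hypothetical helper)."""
    row = dict(listing)
    # currentPrice is a nested dict holding 'amount' and 'currency'
    row['amount'] = listing['currentPrice']['amount']
    return row


# Abbreviated sample rows; only the first is taken from the output above
sample_listings = [
    {'listingId': 1207961705, 'quantity': 2, 'row': 'G',
     'currentPrice': {'amount': 663.3, 'currency': 'USD'}},
    {'listingId': 1, 'quantity': 4, 'row': 'A',
     'currentPrice': {'amount': 100.0, 'currency': 'USD'}},
]

rows = [flatten_listing(item) for item in sample_listings]

# Write to an in-memory buffer; use open('tickets_listing.csv', 'w', newline='') for a file
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['listingId', 'row', 'quantity', 'amount'],
                        extrasaction='ignore')
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Passing extrasaction='ignore' lets the writer skip keys such as currentPrice that are not listed in fieldnames, so only the chosen columns reach the file.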
Here’s what the CSV file looks like now:
Step 3 - Adding Event and Venue Info
I have the ticket information, but what if I want to know some more details about the venue?
In that case, I use StubHub’s EventSearchAPI to get the details.
I already have the eventID, so I just add it to the new URL, and take a peek at the response in dict form:
info_url = 'https://api.stubhub.com/catalog/events/v2/' + eventid
info = requests.get(info_url, headers=headers)
pprint.pprint(info.json())
{u'ancestors': {u'categories': [{u'id': 174}, {u'id': 700188}],
u'groupings': [{u'id': 1500226}],
u'performers': [{u'id': 1500227}]},
u'bobId': 1,
u'categories': {u'primaryCategory': {u'id': 700188,
u'name': u'Musicals Tickets'}},
u'currencyCode': u'USD',
u'description': u'Hamilton New York Tickets',
u'eventDateLocal': u'2016-10-22T20:00:00-04:00',
u'eventDateUTC': u'2016-10-23T00:00:00+0000',
u'eventMeta': {u'keywords': u'Hamilton Richard Rodgers Theatre, Hamilton New York, Hamilton New York 10/22 0800 PM, Hamilton New York 10/22, Hamilton Richard Rodgers Theatre 10/22, buy, sell, tickets, ticket',
u'locale': u'en_US',
u'primaryAct': u'Hamilton New York',
u'primaryName': u'Hamilton New York',
u'secondaryName': u'Hamilton',
u'seoDescription': u'Hamilton 08:00 PM',
u'seoTitle': u'Hamilton Richard Rodgers Theatre New York Tickets - 2016-10-22'},
... }
Lots of relevant info here, and it’s just a matter of extracting what I need from the dict. Then I can add it to my DataFrame before exporting the final result to CSV.
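As a rough illustration of that extraction, the sketch below pulls a few fields that are visible in the response above and attaches them to each ticket row. The helper name extract_event_fields and the output column names are my own choices, and the ticket rows are made up.

```python
def extract_event_fields(event):
    """Pick a few flat fields out of the event-details dict (hypothetical helper)."""
    categories = event.get('categories', {})
    primary = categories.get('primaryCategory', {})
    return {
        'event_description': event.get('description'),
        'event_date_local': event.get('eventDateLocal'),
        'currency': event.get('currencyCode'),
        'category': primary.get('name'),
    }


# Abbreviated copy of the response shown above
sample_event = {
    'description': 'Hamilton New York Tickets',
    'eventDateLocal': '2016-10-22T20:00:00-04:00',
    'currencyCode': 'USD',
    'categories': {'primaryCategory': {'id': 700188, 'name': 'Musicals Tickets'}},
}

event_fields = extract_event_fields(sample_event)

# Attach the event fields to every ticket row (made-up ticket rows here)
ticket_rows = [{'listingId': 1, 'amount': 663.3}, {'listingId': 2, 'amount': 100.0}]
combined = [dict(ticket, **event_fields) for ticket in ticket_rows]
```

With the DataFrame from Step 2, the equivalent would be to assign each value as a constant column, for example listing_df['event_description'] = event_fields['event_description']. Venue fields are not visible in the truncated output above, so their key names would need to be read from the full response.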
Conclusion
After doing some more data cleaning to match the report’s format, I sent it over to my colleague. In a few hours, I was able to show that I can use StubHub’s API to gather the ticket data required. But there were some limitations:
My friend needed this data for about 1,200 events every day. With StubHub’s free tier, I am limited to 10 requests per minute. If each event took 2 API calls, that is 2,400 calls, so this report would take 4 hours to generate every day on the free tier. I’m sure there’s a way to pay StubHub for higher-tier access.
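The estimate works out as 1,200 events × 2 calls ÷ 10 calls per minute = 240 minutes. The sketch below reproduces that arithmetic and shows one simple way to pace calls so they stay under a per-minute limit. Both function names are hypothetical, and the sleep function is passed in so the pacing can be tried without actually waiting.

```python
import time


def estimate_minutes(events, calls_per_event, calls_per_minute):
    """Minutes needed to make every call at the given rate limit."""
    return events * calls_per_event / calls_per_minute


def run_throttled(tasks, calls_per_minute, sleep=time.sleep):
    """Run zero-argument callables, pausing between them to respect the limit.

    Hypothetical helper; `sleep` is injectable so the pacing can be tested.
    """
    delay = 60.0 / calls_per_minute
    results = []
    for index, task in enumerate(tasks):
        if index > 0:
            sleep(delay)  # wait before every call except the first
        results.append(task())
    return results


minutes = estimate_minutes(1200, 2, 10)  # 240.0 minutes
hours = minutes / 60                     # 4.0 hours
```

At 10 calls per minute, the pause between calls is 6 seconds. A fixed delay is the simplest approach; it ignores the time each request itself takes, so the real run would be somewhat longer than the estimate.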
For now, I’ve proved that the report is possible with some API calls, and he’ll be exploring next steps with his team.
Update - Subscribing to Stubhub’s Inventory Search API - v2
Around December or January, StubHub deprecated version 1 of their Inventory Search API in favor of version 2, but a lot of people, including me, had some difficulty subscribing to the new API. It looks like subscribing to “All API” doesn’t include v2 yet, and clicking on InventorySearchAPI-v2 didn’t go anywhere.
But I found a workaround: select InventorySearchAPI and then manually change the URL from “v1” to “v2”.
Then subscribe your application to it! You can also just use this link: https://developer.stubhub.com/store/apis/info?name=InventorySearchAPI&version=v2&provider=runiu&category=Search&api=InventorySearchAPI