Debug Adobe Analytics implementations using Data feeds
Data feeds in Adobe Analytics are a powerful tool for extracting the hit-level data sent to Analytics. They let you combine Adobe Analytics data with other data sources outside the Adobe ecosystem, or land it in destinations such as Azure Blob, S3, or FTP/SFTP locations.
This article walks through setting up data feeds in Analytics and analyzing the raw data.
Step 1: Create the Data Feed
Go to Analytics → Data Feeds → Add. Give the feed a name, select the report suite you want to set up the feed for, and add the email address that should receive notifications on whether each feed delivery succeeded or failed.
Feed Interval: Determines how frequently you want to receive the data. There are two intervals, Hourly and Daily: data is delivered either in hourly batches at the end of each hour, or in daily batches at the end of each day.
Start & End Dates: You can select the number of days you want the feed to run. Select a continuous feed if you want it to keep running for future dates; you will then receive data continuously without any intervention required.
Step 2: Select the Destination
There are four types of destinations supported right now: FTP, SFTP, Azure Blob, and S3. Google BigQuery is yet to be supported.
Step 3: Select the data feed columns you want.
Note: A best practice is to select only the columns that are useful, instead of selecting all of them. Selecting every column increases the processing time in the data feed queue and adds latency to the delivery of the feed. In most use cases you will need no more than about 50 columns.
The most common columns are core fields such as page_url, event_list, and product_list, along with the post_ versions of those values.
Here is the exhaustive list of all the columns that are available in the data feed and their corresponding definitions
Step 4: Download the files once received
Once the feed is set up, you will start receiving files at your destination at the chosen frequency. Each time a feed is delivered, you receive an email notification confirming whether the delivery succeeded or failed.
Once the feed is delivered, you may need a program to extract the files from their .tar.gz archives; the lookup files arrive with the naming convention [Reportsuite]_yyyy-mm-dd_Look-up.zip.
You will see multiple lookup files along with hit_data.tsv, which contains the hits, and column_headers.tsv, which contains all the column headers for the feed.
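To make this concrete, here is a minimal Python sketch of reading a delivered feed. The function name and folder names are hypothetical, but the hit_data.tsv / column_headers.tsv pairing is as described above (hit_data.tsv itself has no header row, so the two files are zipped together):

```python
import csv
import tarfile

def load_hit_data(archive_path, out_dir="feed_output"):
    """Extract a delivered data feed archive and return hits as dicts.

    hit_data.tsv has no header row; the column names are delivered
    separately in column_headers.tsv, so we pair them up here.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(out_dir)

    # A single tab-separated line of column names.
    with open(f"{out_dir}/column_headers.tsv", newline="") as f:
        headers = f.read().strip().split("\t")

    # One row per hit, in the same column order as the headers.
    with open(f"{out_dir}/hit_data.tsv", newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        return [dict(zip(headers, row)) for row in reader]
```

From here, each hit is a plain dictionary keyed by column name, which makes the analysis steps below straightforward.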
Step 5: Analyzing the raw data
Once the data is available, the next step is to analyze it to make data-driven decisions.
For instance, you see hits reaching the server in the debugger, but when you check reporting, you do not see that order coming through in Workspace. A possible reason is that the hit got excluded at the server end. You can extract the data feeds to find the root cause.
Say you want to know all the orders that happened in a day, on which page URLs they happened, and the product ID that is stored in an eVar.
For this use case, use these columns: page_url, product_list, event_list, and any relevant eVars.
event_list: Comma-separated list of numeric IDs representing all the events triggered on the hit. This includes both the default events and custom events 1-1000. It uses event.tsv as the lookup file.
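Since event_list only carries numeric IDs, a small helper can translate them into readable names. This is a sketch, assuming the two-column (ID, name) layout of the event.tsv lookup file described above:

```python
import csv

def decode_event_list(event_list, event_lookup_path="event.tsv"):
    """Translate a raw event_list string (e.g. "1,200") into
    human-readable event names using the event.tsv lookup file."""
    # event.tsv is tab-separated: numeric event ID, then event name.
    with open(event_lookup_path, newline="") as f:
        lookup = dict(csv.reader(f, delimiter="\t"))
    # Fall back to the raw ID if it is missing from the lookup.
    return [lookup.get(eid, eid) for eid in event_list.split(",") if eid]
```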
One of the common questions I often get asked:
What is the difference between columns with a post_ prefix and columns without a post_ prefix in data feeds?
Columns that do not have the post_ prefix contain data that is exactly as it was sent from the website to data collection. Columns with a post_ prefix contain the values that are populated post-processing at the server end.
A value can change server side through variable persistence settings, processing rules, VISTA rules, currency conversion, or any other server-side logic that Adobe applies. Adobe recommends using the post_ version of a column wherever possible.
Processing order in Adobe Analytics
The page_url contains the URL of the hit and the product_list column contains the product list as passed in through the products variable.
Products are separated by commas and follow the s.products syntax while individual product properties are separated by semicolons.
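That syntax can be split apart with a few lines of Python. The field names below follow the standard s.products order (category;product;quantity;total price), with events and merchandising eVars as optional trailing fields; the function name is my own:

```python
def parse_product_list(product_list):
    """Split a raw product_list string into per-product field dicts.

    Products are comma-separated; within each product, fields are
    semicolon-separated in s.products order.
    """
    fields = ["category", "product", "quantity", "price", "events", "evars"]
    products = []
    for entry in product_list.split(","):
        # zip() stops at the shorter sequence, so trailing optional
        # fields are simply absent when not supplied.
        products.append(dict(zip(fields, entry.split(";"))))
    return products
```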
Now, going back to the example of analyzing orders through data feeds.
Filter for hits whose event_list column contains the value "1"; event 1 indicates a purchase (order). If you are using a destination like S3, you can run SQL queries over the data based on your reporting requirements.
A sample SQL query would look something like this. Note that a naive LIKE '%1%' would also match events 10, 11, 100, and so on, so wrap event_list in delimiters to match the event ID exactly:
SELECT * FROM [table_name] WHERE concat(',', event_list, ',') LIKE '%,1,%' AND page_url = 'https://www.abc.com/thankyou.html'
This query would return all the hits that are orders.
Look out for columns like duplicate_purchase, a flag indicating that the purchase event on this hit was ignored because it was a duplicate.
Similarly, look at post_product_list, post_purchase_id, and curr_code. These are the crucial columns that indicate whether an order was excluded or included.
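Putting those columns together, a triage check over the hits might look like the sketch below. The rules are illustrative simplifications for debugging, not Adobe's exact server-side exclusion logic:

```python
def classify_order_hit(hit):
    """Given one hit (a dict of data feed columns), give a rough
    explanation of whether the order on that hit counted in reporting.

    A debugging sketch: column names follow the standard data feed
    schema, but the rules are simplified for illustration.
    """
    events = (hit.get("post_event_list") or "").split(",")
    if "1" not in events:
        return "no purchase event on this hit"
    if hit.get("duplicate_purchase") == "1":
        return "excluded: duplicate purchase (purchase id already seen)"
    if not hit.get("post_product_list"):
        return "suspect: purchase event with empty post_product_list"
    return "counted as an order"
```

Running this over every hit in a day quickly surfaces the orders that were dropped server side and why.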
In this way, data feeds are a powerful tool to analyze how the data is collected and processed at the server level in Adobe Analytics.
2. Marketing channel attribution mismatch
Another common use case: a tracking code gets attributed to the wrong channel, or all of your orders get attributed to a different channel than expected.
The best way to understand why this happened is to extract the data feed for that specific day, find the specific visit of the visitor, and walk through the hits from the first hit of the visit in which the order happened.
For this scenario, use these columns:
va_closer_id: Numeric ID that identifies the last-touch channel dimension. The lookup for this ID is the channel list you have set up in the Marketing Channel Manager UI.
Look for the hits that contained the order and, in parallel, check the value of va_closer_id. Every hit goes through the marketing channel processing rules in waterfall order, and each hit is attributed according to that logic. Based on the va_closer_id value, check what logic is configured behind that channel.
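As a sketch, you could walk one visitor's hits in time order and translate va_closer_id using a mapping copied by hand from your Marketing Channel Manager UI. The channel names and the function below are hypothetical examples:

```python
# Hypothetical mapping, copied manually from the Marketing Channel
# Manager UI; the IDs correspond to your configured channel list.
CHANNEL_LOOKUP = {
    "1": "Paid Search",
    "2": "Natural Search",
    "3": "Email",
    "4": "Direct",
}

def trace_visit_channels(hits, visid_high, visid_low):
    """Return (hit_time_gmt, channel) for one visitor's hits in time
    order, so you can see where the attribution flipped."""
    visit = [h for h in hits
             if h.get("post_visid_high") == visid_high
             and h.get("post_visid_low") == visid_low]
    visit.sort(key=lambda h: int(h.get("hit_time_gmt", 0)))
    return [(h["hit_time_gmt"],
             CHANNEL_LOOKUP.get(h.get("va_closer_id", ""), "Unknown"))
            for h in visit]
```

Reading the resulting timeline next to your processing rules usually makes it obvious which rule claimed the hit.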
These are some of the useful use cases and scenarios. Data feed debugging is limitless: you can debug almost any implementation use case or reporting discrepancy.
Let me know your thoughts in the comments: how you have used data feeds, or whether there is a scenario or use case you wish to debug using data feeds. I will be glad to help.