The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My badges

  • Twitter Updates

  • My Flickr Stream

  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 1,858 other subscribers

Exporting Chrome History (with the “new” configuration and state file structure), and Epoch dates on various systems

Posted by jpluimers on 2025/05/02

Quite a while ago, Chrome moved from a structure based on “Current Session“, “Current Tabs“, “Last Session” and “Last Tabs” into “Session_#################” and “Tabs_#################” stored in a “Sessions” folder (and similar migrations for other state and configuration files).

The numbers in the “Session_*” and “Tabs_*” files are time stamps of those sessions, for instance one needs to figure out what the “13310808970819630” in “Session_13310808970819630” and “Session_13310808970819630” means.

Lot’s of web-pages with tips and tricks around the old structures are still around, often surfacing high in Google Search results.

I was interested in a particular trick to export Google Chrome browsing history and had a hard time figuring out the easiest solution.

Yes I know there are extensions like [Wayback/Archive] Session Buddy – Chrome Web Store (at the time of writing the last update was fall 2021) and [Wayback/Archive] Tabs Backup & Restore – Chrome Web Store (at the time of writing last update fall 2012).

Both of them are not consistent in saving automagically so upon Chrome crashes you either miss the most recent tabs or older backups are not available.

Upon crashing, Chrome sometimes does not correctly restore Tabs and Sessions file either so that’s why I wrote this post, as the History file is the only thing left to try semi-manually restore tabs.

The Tabs_* and Session_* data files are binary, start with SNSS followed by a binary version number (still 1), are undocumented and not easy to export though this might come in useful (as the internal format seems to be unchanged): [Wayback/Archive] cclgroupltd/ccl-ssns: Automatically exported from code.google.com/p/ccl-ssns

A set of Python modules for performing off-line parsing of Chrome session files (“Current Session”, “Last Session”, “Current Tabs”, “Last Tabs”).

Back to exporting the Chrome history

Dissecting the History data is a lot easier than the Tabs and Session files as the History file is stored in SQLite format.

There are even tools that can read the History SQLite database and export it for you. You can also do it yourself, which requires SQLite knowledge. That’s why I wrote my blog post Some SQLite things I recently learned a while ago.

On Windows, the by far easiest solution is [Wayback/Archive] ChromeHistoryView – View the browsing history of Chrome Web browser:

ChromeHistoryView is a small utility that reads the history data file of Google Chrome Web browser, and displays the list of all visited Web pages in the last days. For each visited Web page, the following information is displayed: URL, Title, Visit Date/Time, Number of visits, number of times that the user typed this address (Typed Count), Referrer, and Visit ID.
You can select one or more history items, and them export them into html/xml/csv/text file, or copy the information to the clipboard and paste it into Excel.

You can also use this tool to read the history of other Chromium-based Web browsers, like Microsoft Edge, Opera, Vivaldi, and others…

Note that despite versalite, it can run very slow, especially on systems with a rich history. The reason is that it loads the full history file upfront, which can result in a lot of history entries. In my case the count until some 20k records was linear then slowed down until some 200k then ChromeHistoryView seemed to hang, but I let it run anyway (as it seemed single-threaded but not message-pumping which Windows often interprets as “not responding”), then after minutes passed by suddenly became alive and showed 450k+ records over the course of slightly more than a year).

This lead me to seeing if querying with SQLite would be easier, but before that we need to understand one more thing: the way that timestamps are stored in the History database.

Tabs/Session/History timestamps on the console

A nice thing is that timestamps are stored the same way in the History file as they are in the Tabs and Session files. Which means we’re back at figuring out what the “13310808970819630” in “Session_13310808970819630” and “Session_13310808970819630” means.

The timestamp has a resolution in microseconds and contrary what you might think (with Google mainly having a nx background) the zero timestamp value is based on Windows timestamps.

Before explaining this further, let’s show some console examples on various operating systems to convert it to a human readable UTC timestamp:

  1. BSD/MacOS:
    # date -u -j -r `expr 13310808970819630 / 1000000 - 11644473600`
    Fri Oct 21 06:56:10 UTC 2022
  2. Linux:
    # date --utc --date @`expr 13310808970819630 / 1000000 - 11644473600`
    Fri Oct 21 06:56:10 UTC 2022
    
  3. Windows:
    C:\temp> PowerShell "(([System.DateTimeOffset]::FromUnixTimeSeconds(13310808970819630 / 1000000 - 11644473600)).DateTime).ToString('s')"
    2022-10-21T06:56:11
    

In case you wonder where the 1000000 and 11644473600 come from:

  • 1000000 is the number of microseconds per seconds (so Chrome uses microsecond time resolution)
  • 11644473600 is the number of seconds between the Epoch (timestamp 0) of Windows epoch (1 January 1601 00:00:00 UTC) and Unix epoch (1 January 1970 00:00:00 UTC), see further below for links.
    I think this might be related to Google protobuf as that is used in a lot of Google products to serialise data.

The History file structure

On some of my systems the Chrome History file is huge and seems to have an extremely long history. So querying it needs limitation of the records returned and for that knowing about the structure helps.

This is the schema with syntax:

C:\temp> sqlite3 C:\temp\History .schema
CREATE TABLE meta(key LONGVARCHAR NOT NULL UNIQUE PRIMARY KEY, value LONGVARCHAR);
CREATE TABLE urls(id INTEGER PRIMARY KEY AUTOINCREMENT,url LONGVARCHAR,title LONGVARCHAR,visit_count INTEGER DEFAULT 0 NOT NULL,typed_count INTEGER DEFAULT 0 NOT NULL,last_visit_time INTEGER NOT NULL,hidden INTEGER DEFAULT 0 NOT NULL);
CREATE TABLE sqlite_sequence(name,seq);
CREATE TABLE visit_source(id INTEGER PRIMARY KEY,source INTEGER NOT NULL);
CREATE TABLE keyword_search_terms (keyword_id INTEGER NOT NULL,url_id INTEGER NOT NULL,term LONGVARCHAR NOT NULL,normalized_term LONGVARCHAR NOT NULL);
CREATE INDEX keyword_search_terms_index1 ON keyword_search_terms (keyword_id, normalized_term);
CREATE INDEX keyword_search_terms_index2 ON keyword_search_terms (url_id);
CREATE INDEX keyword_search_terms_index3 ON keyword_search_terms (term);
CREATE TABLE downloads (id INTEGER PRIMARY KEY,guid VARCHAR NOT NULL,current_path LONGVARCHAR NOT NULL,target_path LONGVARCHAR NOT NULL,start_time INTEGER NOT NULL,received_bytes INTEGER NOT NULL,total_bytes INTEGER NOT NULL,state INTEGER NOT NULL,danger_type INTEGER NOT NULL,interrupt_reason INTEGER NOT NULL,hash BLOB NOT NULL,end_time INTEGER NOT NULL,opened INTEGER NOT NULL,last_access_time INTEGER NOT NULL,transient INTEGER NOT NULL,referrer VARCHAR NOT NULL,site_url VARCHAR NOT NULL,tab_url VARCHAR NOT NULL,tab_referrer_url VARCHAR NOT NULL,http_method VARCHAR NOT NULL,by_ext_id VARCHAR NOT NULL,by_ext_name VARCHAR NOT NULL,etag VARCHAR NOT NULL,last_modified VARCHAR NOT NULL,mime_type VARCHAR(255) NOT NULL,original_mime_type VARCHAR(255) NOT NULL, embedder_download_data VARCHAR NOT NULL DEFAULT '');
CREATE TABLE downloads_url_chains (id INTEGER NOT NULL,chain_index INTEGER NOT NULL,url LONGVARCHAR NOT NULL, PRIMARY KEY (id, chain_index) );
CREATE TABLE downloads_slices (download_id INTEGER NOT NULL,offset INTEGER NOT NULL,received_bytes INTEGER NOT NULL,finished INTEGER NOT NULL DEFAULT 0,PRIMARY KEY (download_id, offset) );
CREATE TABLE segments (id INTEGER PRIMARY KEY,name VARCHAR,url_id INTEGER NON NULL);
CREATE INDEX segments_name ON segments(name);
CREATE INDEX segments_url_id ON segments(url_id);
CREATE TABLE segment_usage (id INTEGER PRIMARY KEY,segment_id INTEGER NOT NULL,time_slot INTEGER NOT NULL,visit_count INTEGER DEFAULT 0 NOT NULL);
CREATE INDEX segment_usage_time_slot_segment_id ON segment_usage(time_slot, segment_id);
CREATE INDEX segments_usage_seg_id ON segment_usage(segment_id);
CREATE TABLE typed_url_sync_metadata (storage_key INTEGER PRIMARY KEY NOT NULL,value BLOB);
CREATE INDEX urls_url_index ON urls (url);
CREATE TABLE content_annotations (visit_id INTEGER PRIMARY KEY,floc_protected_score DECIMAL(3, 2),categories VARCHAR,page_topics_model_version INTEGER,annotation_flags INTEGER DEFAULT 0 NOT NULL, entities VARCHAR, related_searches VARCHAR, visibility_score NUMERIC DEFAULT -1, search_normalized_url, search_terms LONGVARCHAR, alternative_title, page_language VARCHAR, password_state INTEGER DEFAULT 0 NOT NULL);
CREATE TABLE context_annotations(visit_id INTEGER PRIMARY KEY,context_annotation_flags INTEGER DEFAULT 0 NOT NULL,duration_since_last_visit INTEGER,page_end_reason INTEGER, total_foreground_duration NUMERIC DEFAULT -1000000, browser_type INTEGER DEFAULT 0 NOT NULL, window_id INTEGER DEFAULT -1 NOT NULL, tab_id INTEGER DEFAULT -1 NOT NULL, task_id INTEGER DEFAULT -1 NOT NULL, root_task_id INTEGER DEFAULT -1 NOT NULL, parent_task_id INTEGER DEFAULT -1 NOT NULL, response_code INTEGER DEFAULT 0 NOT NULL);
CREATE TABLE downloads_reroute_info (download_id INTEGER NOT NULL,reroute_info_serialized  VARCHAR NOT NULL,PRIMARY KEY (download_id) );
CREATE TABLE IF NOT EXISTS "visits"(id INTEGER PRIMARY KEY AUTOINCREMENT,url INTEGER NOT NULL,visit_time INTEGER NOT NULL,from_visit INTEGER,transition INTEGER DEFAULT 0 NOT NULL,segment_id INTEGER,visit_duration INTEGER DEFAULT 0 NOT NULL,incremented_omnibox_typed_score BOOLEAN DEFAULT FALSE NOT NULL,opener_visit INTEGER, originator_cache_guid TEXT, originator_visit_id INTEGER, originator_from_visit INTEGER, originator_opener_visit INTEGER);
CREATE INDEX visits_url_index ON visits (url);
CREATE INDEX visits_from_index ON visits (from_visit);
CREATE INDEX visits_time_index ON visits (visit_time);
CREATE INDEX visits_originator_id_index ON visits (originator_visit_id);
CREATE TABLE cluster_keywords(cluster_id INTEGER NOT NULL,keyword VARCHAR NOT NULL,type INTEGER NOT NULL,score NUMERIC NOT NULL,collections VARCHAR NOT NULL);
CREATE TABLE clusters(cluster_id INTEGER PRIMARY KEY,should_show_on_prominent_ui_surfaces BOOLEAN NOT NULL,label VARCHAR NOT NULL,raw_label VARCHAR NOT NULL);
CREATE TABLE clusters_and_visits(cluster_id INTEGER NOT NULL,visit_id INTEGER NOT NULL,score NUMERIC NOT NULL,engagement_score NUMERIC NOT NULL,url_for_deduping LONGVARCHAR NOT NULL,normalized_url LONGVARCHAR NOT NULL,url_for_display LONGVARCHAR NOT NULL,PRIMARY KEY(cluster_id,visit_id))WITHOUT ROWID;
CREATE INDEX clusters_for_visit ON clusters_and_visits(visit_id);

It means these tables are relevant for URLs:

  • CREATE TABLE urls(
    • id INTEGER PRIMARY KEY AUTOINCREMENT,
    • url LONGVARCHAR,
    • title LONGVARCHAR,
    • visit_count INTEGER DEFAULT 0 NOT NULL,
    • typed_count INTEGER DEFAULT 0 NOT NULL,
    • last_visit_time INTEGER NOT NULL,
    • hidden INTEGER DEFAULT 0 NOT NULL);
  • CREATE TABLE downloads_url_chains (
    • id INTEGER NOT NULL,
    • chain_index INTEGER NOT NULL,
    • url LONGVARCHAR NOT NULL,
    • PRIMARY KEY (id, chain_index) );

None of them contain tab or session information but if your tabs/session files are busted there is not much else you can do than read these tables.

Sqlite syntax and errors

Given each new SQL database engine has its own gotchas, I of course bumped in a few. For example when trying to get an idea on how many URLs were stored, bumped into this:

C:\temp> sqlite3 c:\temp\History "select count(*) as all, count(distinct url) as unique from urls"
Error: in prepare, near "all": syntax error
  select count(*) as all, count(distinct url) as unique from urls
                     ^--- error here

followed by

C:\temp> sqlite3 c:\temp\History "select count(*) as any, count(distinct url) as unique from urls"
Error: in prepare, near "unique": syntax error
  select count(*) as any, count(distinct url) as unique from urls
                                   error here ---^

It looks like all and unique are reserved words under all circumstances, which means you need to quote any English word that potentially is or could become a reserved word:

C:\temp>sqlite3 c:\temp\History -json "select count(*) as 'any', count(distinct url) as 'unique' from urls"
[{"any":10373,"unique":10373}]

In this case there are no duplicate url entries in the History table. I checked systems with larger History files too and these did also not have duplicates. My assumption is that Chrome will update any existing entry with the most recent last_visit_time.

This is great, as it will make querying a lot easier.

Queries

zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz

Searches and results

Related blog posts

--jeroen

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.