openalex_local API

OpenAlex Local: a local OpenAlex database with 284M+ works and semantic search.

Example

>>> from openalex_local import search, get
>>> results = search("machine learning neural networks")
>>> work = get("W2741809807")  # OpenAlex ID
>>> work = get("10.1038/nature12373")  # or DOI

openalex_local.search(query, limit=20, offset=0)

Full-text search across works.

Uses FTS5 index for fast searching across titles and abstracts.

Parameters:
  • query (str) – Search query (supports FTS5 syntax)

  • limit (int) – Maximum results to return

  • offset (int) – Skip first N results (for pagination)

Return type:

SearchResult

Returns:

SearchResult with matching works

Example

>>> from openalex_local import search
>>> results = search("machine learning")
>>> print(f"Found {results.total} matches")
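
The FTS5 behaviour that search relies on can be tried in isolation with Python's built-in sqlite3 module. The following is a minimal sketch, not the library's implementation: the works_fts table, its two columns, and the fts_search helper are hypothetical names chosen for illustration, and the sketch assumes the local SQLite build includes FTS5.

```python
import sqlite3

# Hypothetical two-column schema; the real openalex_local schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE works_fts USING fts5(title, abstract)")
conn.executemany(
    "INSERT INTO works_fts (title, abstract) VALUES (?, ?)",
    [
        ("Deep learning for images", "Convolutional neural networks classify images."),
        ("Machine learning in biology", "A survey of machine learning methods."),
        ("The state of OA", "Open access prevalence across scholarly articles."),
    ],
)


def fts_search(conn, query, limit=20, offset=0):
    """Return (total, titles) for an FTS5 query, best matches first."""
    total = conn.execute(
        "SELECT count(*) FROM works_fts WHERE works_fts MATCH ?", (query,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT title FROM works_fts WHERE works_fts MATCH ? "
        "ORDER BY rank LIMIT ? OFFSET ?",
        (query, limit, offset),
    ).fetchall()
    return total, [title for (title,) in rows]


# A quoted phrase must match as adjacent tokens; OR is an FTS5 boolean operator.
total, titles = fts_search(conn, '"machine learning"')
total_or, titles_or = fts_search(conn, "neural OR survey")
```

Keeping the total separate from the page of rows mirrors the split between SearchResult.total and SearchResult.works, so a caller can report the full match count while fetching only limit rows.
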

openalex_local.count(query)

Count matching works without fetching results.

Parameters:

query (str) – FTS5 search query

Return type:

int

Returns:

Number of matching works
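
Because count returns only a number, it can be used to plan pagination before any results are fetched. The sketch below shows the arithmetic; page_offsets is a hypothetical helper, not part of openalex_local.

```python
import math


def page_offsets(total, limit=20):
    """Return the offset of each page needed to cover `total` results."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [page * limit for page in range(math.ceil(total / limit))]


# 45 matches at 20 per page need three pages, starting at offsets 0, 20 and 40.
offsets = page_offsets(45, limit=20)
```

Each offset would then be passed to search together with the same query and limit.
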

openalex_local.get(id_or_doi)

Get a work by OpenAlex ID or DOI.

Parameters:

id_or_doi (str) – OpenAlex ID (e.g., W2741809807) or DOI

Return type:

Optional[Work]

Returns:

Work object or None if not found

Example

>>> from openalex_local import get
>>> work = get("W2741809807")
>>> work = get("10.1038/nature12373")
>>> print(work.title)
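
Since get returns None for an unknown identifier, accessing work.title directly can raise AttributeError. A minimal sketch of a guarded lookup follows; title_or_placeholder and fake_get are hypothetical, and fake_get only stands in for get so that the example runs without a database.

```python
from types import SimpleNamespace


def title_or_placeholder(lookup, id_or_doi):
    """Look up a work and return its title, tolerating a missing record."""
    work = lookup(id_or_doi)
    if work is None:
        return f"<not found: {id_or_doi}>"
    return work.title or "<untitled>"


# Stand-in for openalex_local.get so the sketch runs without a database.
_fake_db = {"W2741809807": SimpleNamespace(title="The state of OA")}
fake_get = _fake_db.get

found = title_or_placeholder(fake_get, "W2741809807")
missing = title_or_placeholder(fake_get, "W0000000000")
```

With the library installed and configured, get itself would be passed as the lookup argument.
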

openalex_local.get_many(ids)

Get multiple works by OpenAlex ID or DOI.

Parameters:

ids (List[str]) – List of OpenAlex IDs or DOIs

Return type:

List[Work]

Returns:

List of Work objects (missing IDs are skipped)
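
Because missing IDs are skipped, the returned list may be shorter than the input and cannot be matched to it by position. The sketch below shows one way to recover which identifiers were not found. find_missing is a hypothetical helper, the DOI is made up, and the comparison assumes that Work.doi holds a bare DOI rather than a full URL.

```python
from types import SimpleNamespace


def find_missing(requested, works):
    """Return the requested identifiers that no returned work accounts for."""
    seen = set()
    for work in works:
        seen.add(work.openalex_id.lower())
        if work.doi:
            seen.add(work.doi.lower())  # DOIs are case-insensitive
    return [ident for ident in requested if ident.lower() not in seen]


# Stand-ins for Work objects; only the two attributes used above are modelled.
returned = [
    SimpleNamespace(openalex_id="W2741809807", doi=None),
    SimpleNamespace(openalex_id="W1111111111", doi="10.1234/example"),
]
requested = ["W2741809807", "10.1234/EXAMPLE", "W0000000000"]
not_found = find_missing(requested, returned)
```
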

openalex_local.exists(id_or_doi)

Check if a work exists in the database.

Parameters:

id_or_doi (str) – OpenAlex ID or DOI

Return type:

bool

Returns:

True if work exists

openalex_local.info()

Get database/API information.

Return type:

dict

Returns:

Dictionary with database stats and mode info

Raises:

FileNotFoundError – If no database configured and HTTP mode unavailable

openalex_local.enrich(results, include_abstract=True, include_concepts=True)

Enrich search results with full metadata.

This function re-fetches works from the database to ensure all fields are populated, including abstract and concepts, which may be truncated in search results.

Parameters:
  • results (SearchResult) – SearchResult from a search query

  • include_abstract (bool) – Include full abstract text (default True)

  • include_concepts (bool) – Include concept/topic data (default True)

Return type:

SearchResult

Returns:

SearchResult with enriched Work objects

Example

>>> results = search("machine learning", limit=10)
>>> enriched = enrich(results)
>>> for work in enriched:
...     print(work.abstract)  # Full abstract available

openalex_local.enrich_ids(ids, include_abstract=True, include_concepts=True)

Enrich a list of OpenAlex IDs or DOIs with full metadata.

Parameters:
  • ids (List[str]) – List of OpenAlex IDs (e.g., W2741809807) or DOIs

  • include_abstract (bool) – Include full abstract text (default True)

  • include_concepts (bool) – Include concept/topic data (default True)

Return type:

List[Work]

Returns:

List of Work objects with full metadata

Example

>>> ids = ["W2741809807", "10.1038/nature12373"]
>>> works = enrich_ids(ids)
>>> for work in works:
...     print(f"{work.title}: {work.cited_by_count} citations")

openalex_local.configure(db_path)

Configure for local database access.

Parameters:

db_path (str) – Path to OpenAlex SQLite database

Return type:

None

Example

>>> from openalex_local import configure
>>> configure("/path/to/openalex.db")
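
Before passing a path to configure, it can be useful to confirm that the file really is a SQLite database. Every SQLite 3 file begins with a fixed 16-byte header string, so a cheap pre-flight check is possible. This is a sketch only: looks_like_sqlite is a hypothetical helper, and whether configure performs any such validation itself is not stated here.

```python
import os
import sqlite3
import tempfile

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of every SQLite 3 file


def looks_like_sqlite(path):
    """Return True if `path` is a readable file starting with the SQLite header."""
    try:
        with open(path, "rb") as handle:
            return handle.read(16) == SQLITE_MAGIC
    except OSError:
        return False


# Demonstrate on a throwaway database rather than a real OpenAlex file.
tmp_dir = tempfile.mkdtemp()
db_path = os.path.join(tmp_dir, "demo.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()
conn.close()

valid = looks_like_sqlite(db_path)
invalid = looks_like_sqlite(os.path.join(tmp_dir, "missing.db"))
```
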

openalex_local.get_mode()

Get current mode.

Return type:

str

Returns:

“db” or “http”

class openalex_local.Work(openalex_id, doi=None, title=None, abstract=None, authors=<factory>, year=None, source=None, issn=None, volume=None, issue=None, pages=None, publisher=None, type=None, concepts=<factory>, topics=<factory>, cited_by_count=None, referenced_works=<factory>, is_oa=False, oa_url=None, scitex_if=None, source_h_index=None, source_cited_by_count=None)

Bases: object

Represents a scholarly work from OpenAlex.

openalex_id

OpenAlex ID (e.g., W2741809807)

doi

Digital Object Identifier

title

Work title

abstract

Abstract text (reconstructed from inverted index)

authors

List of author names

year

Publication year

source

Journal/venue name

issn

Journal ISSN

volume

Volume number

issue

Issue number

pages

Page range

publisher

Publisher name

type

Work type (journal-article, book-chapter, etc.)

concepts

List of OpenAlex concepts

topics

List of OpenAlex topics

cited_by_count

Number of citations

referenced_works

List of referenced OpenAlex IDs

is_oa

Is open access

oa_url

Open access URL

openalex_id: str
doi: Optional[str] = None
title: Optional[str] = None
abstract: Optional[str] = None
authors: List[str]
year: Optional[int] = None
source: Optional[str] = None
issn: Optional[str] = None
volume: Optional[str] = None
issue: Optional[str] = None
pages: Optional[str] = None
publisher: Optional[str] = None
type: Optional[str] = None
concepts: List[Dict[str, Any]]
topics: List[Dict[str, Any]]
cited_by_count: Optional[int] = None
referenced_works: List[str]
is_oa: bool = False
oa_url: Optional[str] = None
scitex_if: Optional[float] = None
source_h_index: Optional[int] = None
source_cited_by_count: Optional[int] = None

classmethod from_openalex(data)

Create Work from OpenAlex API/snapshot JSON.

Parameters:

data (dict) – OpenAlex work dictionary

Return type:

Work

Returns:

Work instance
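
The abstract attribute is described above as reconstructed from an inverted index. OpenAlex work records store abstracts in an abstract_inverted_index field that maps each word to the positions where it occurs. The sketch below shows how such a mapping can be turned back into text; reconstruct_abstract is a hypothetical helper and the sample data is invented, so this is not necessarily how from_openalex does it.

```python
def reconstruct_abstract(inverted_index):
    """Rebuild abstract text from a {word: [positions]} mapping."""
    if not inverted_index:
        return None
    positioned = [
        (position, word)
        for word, positions in inverted_index.items()
        for position in positions
    ]
    # Sorting by position restores the original word order.
    return " ".join(word for _, word in sorted(positioned))


sample = {"Open": [0], "access": [1, 4], "is": [2], "growing;": [3], "matters.": [5]}
text = reconstruct_abstract(sample)
```

A word that occurs several times, such as "access" here, appears once in the mapping with several positions.
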

classmethod from_db_row(data)

Create Work from database row dictionary.

Parameters:

data (dict) – Database row as dictionary (with parsed JSON fields)

Return type:

Work

Returns:

Work instance

to_dict()

Convert to dictionary.

Return type:

dict

citation(style='apa')

Format work as a citation string.

Parameters:

style (str) – Citation style: “apa” (default) or “bibtex”

Return type:

str

Returns:

Formatted citation string

Example

>>> work.citation()  # APA format
'Piwowar, H., & Priem, J. (2018). The state of OA. PeerJ.'
>>> work.citation("bibtex")  # BibTeX format
'@article{W2741809807, title={The state of OA}, ...}'

_citation_apa()

Format as APA citation.

Return type:

str

_format_author_apa(name)

Format author name for APA (Last, F. M.).

Return type:

str
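
The "Last, F. M." pattern can be illustrated with a short standalone function. This is a simplified sketch and not the library's private method: format_author_apa is a hypothetical name, and the sketch assumes names are written given-names-first with a single-token family name.

```python
def format_author_apa(name):
    """Turn 'First Middle Last' into 'Last, F. M.'."""
    parts = name.split()
    if not parts:
        return ""
    if len(parts) == 1:
        return parts[0]  # a single token is kept as-is
    *given, family = parts
    initials = " ".join(f"{part[0].upper()}." for part in given)
    return f"{family}, {initials}"


one = format_author_apa("Heather Piwowar")
three = format_author_apa("Ada Mary Lovelace")
```

Multi-word family names and name particles would need extra handling that this sketch leaves out.
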

_citation_bibtex()

Format as BibTeX entry.

Return type:

str
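
A BibTeX entry is a type, a key, and a set of brace-delimited fields. The sketch below builds one from plain values; bibtex_entry is a hypothetical helper, and the exact fields and layout emitted by the library may differ.

```python
def bibtex_entry(key, fields, entry_type="article"):
    """Render a BibTeX entry, skipping fields whose value is None or empty."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        if value:
            lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


entry = bibtex_entry(
    "W2741809807",
    {"title": "The state of OA", "journal": "PeerJ", "year": 2018, "volume": None},
)
```

Skipping empty values matters because most Work fields are optional and may be None.
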

to_text(include_abstract=False)

Format as human-readable text.

Parameters:

include_abstract (bool) – Include abstract in output

Return type:

str

Returns:

Formatted text string

save(path, format='json')

Save work to file.

Parameters:
  • path (str) – Output file path

  • format (str) – Output format (“text”, “json”, “bibtex”)

Return type:

str

Returns:

Path to saved file

Examples

>>> work = get("W2741809807")
>>> work.save("paper.json")
>>> work.save("paper.bib", format="bibtex")

class openalex_local.SearchResult(works, total, query, elapsed_ms)

Bases: object

Container for search results with metadata.

works

List of Work objects

total

Total number of matches

query

Original search query

elapsed_ms

Search time in milliseconds

works: List[Work]
total: int
query: str
elapsed_ms: float

save(path, format='json', include_abstract=True)

Save search results to file.

Parameters:
  • path (str) – Output file path

  • format (str) – Output format (“text”, “json”, “bibtex”)

  • include_abstract (bool) – Include abstracts in text format

Return type:

str

Returns:

Path to saved file

Examples

>>> results = search("machine learning", limit=10)
>>> results.save("results.json")
>>> results.save("results.bib", format="bibtex")
>>> results.save("results.txt", format="text")

openalex_local.save(data, path, format='json', include_abstract=True)

Save Work(s) or SearchResult to a file.

Parameters:
  • data (Union[Work, SearchResult, List[Work]]) – Work, SearchResult, or list of Works to save

  • path (Union[str, Path]) – Output file path

  • format (str) – Output format (“text”, “json”, “bibtex”)

  • include_abstract (bool) – Include abstracts in text format

Return type:

str

Returns:

Path to saved file

Raises:

ValueError – If format is not supported

Examples

>>> from openalex_local import search, save
>>> results = search("machine learning", limit=10)
>>> save(results, "results.json")
>>> save(results, "results.bib", format="bibtex")
>>> save(results, "results.txt", format="text")
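
The contract described above (three named formats, a ValueError for anything else, and the saved path as the return value) can be sketched on plain dictionaries. save_records and its output layouts are hypothetical and only illustrate the dispatch; they are not the library's serialisers.

```python
import json
import os
import tempfile

SUPPORTED_FORMATS = ("text", "json", "bibtex")


def save_records(records, path, format="json"):
    """Write a list of dicts to `path` and return the path as a string."""
    if format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {format!r}")
    if format == "json":
        body = json.dumps(records, indent=2, ensure_ascii=False)
    elif format == "bibtex":
        body = "\n\n".join(
            f"@misc{{{r['openalex_id']}, title={{{r['title']}}}}}" for r in records
        )
    else:  # "text"
        body = "\n".join(f"{r['openalex_id']}: {r['title']}" for r in records)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(body)
    return str(path)


records = [{"openalex_id": "W2741809807", "title": "The state of OA"}]
out_dir = tempfile.mkdtemp()
json_path = save_records(records, os.path.join(out_dir, "results.json"))

try:
    save_records(records, os.path.join(out_dir, "results.xml"), format="xml")
    rejected = False
except ValueError:
    rejected = True
```

Validating the format before opening the file means an unsupported format never leaves an empty file behind.
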

Core Functions

get

count

info

Configuration

configure

configure_http

get_mode

Data Classes

Work

SearchResult

Config