get_df_unique_prefixes
- get_df_unique_prefixes(df: DataframeOrSeries, *, column: str | int | None = None, validate: bool = False, converter: Converter | None = None) → set[str]
Get the unique prefixes of the CURIEs in a dataframe column or a series.
- Parameters:
df – A dataframe or series. If a dataframe is given, the column must not be None.
column – The column to check, if a dataframe was passed. If a series was passed, this can be left as None.
validate – Should the prefixes be validated against the converter?
converter – A converter for validating CURIEs
- Returns:
A set of prefixes appearing in CURIEs in the given column
import pandas as pd
from curies.dataframe import get_df_unique_prefixes

rows = [
    ("DOID:0080795", "skos:exactMatch", "EFO:0003029", "semapv:ManualMappingCuration"),
    ("DOID:0080795", "skos:exactMatch", "mesh:D015471", "semapv:ManualMappingCuration"),
    ("DOID:0080799", "skos:exactMatch", "EFO:1000527", "semapv:ManualMappingCuration"),
    ("DOID:0080808", "skos:exactMatch", "mesh:D000069295", "semapv:ManualMappingCuration"),
]
df = pd.DataFrame(
    rows, columns=["subject_id", "predicate_id", "object_id", "mapping_justification"]
)

# Only the prefixes found in the "object_id" column are returned
assert get_df_unique_prefixes(df, column="object_id") == {"EFO", "mesh"}
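
For intuition about what "a set of prefixes appearing in CURIEs" means, the following is a minimal, library-independent sketch in plain Python. It is not the curies implementation: `unique_prefixes` and `known_prefixes` are hypothetical names, the set of known prefixes is only a stand-in for a real converter, and raising on strings without a colon is a choice made for this sketch rather than documented behavior.

```python
from collections.abc import Iterable
from typing import Optional


def unique_prefixes(
    curies: Iterable[str], known_prefixes: Optional[set[str]] = None
) -> set[str]:
    """Collect the prefix (the text before the first colon) of each CURIE.

    ``known_prefixes`` is a hypothetical stand-in for a converter: when it is
    given, any prefix outside that set raises ``ValueError``.
    """
    prefixes: set[str] = set()
    for curie in curies:
        # Split on the first colon only, so identifiers may contain colons
        prefix, sep, _identifier = curie.partition(":")
        if not sep:
            raise ValueError(f"not a CURIE (no colon): {curie!r}")
        if known_prefixes is not None and prefix not in known_prefixes:
            raise ValueError(f"unknown prefix: {prefix!r}")
        prefixes.add(prefix)
    return prefixes


# Values taken from the "object_id" column of the example above
object_ids = ["EFO:0003029", "mesh:D015471", "EFO:1000527", "mesh:D000069295"]
assert unique_prefixes(object_ids) == {"EFO", "mesh"}
```

Any iterable of strings works as input here, so the values of a single dataframe column could be passed in the same way as the list above.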