bibliometa.graph package

Submodules

bibliometa.graph.analysis module

This module provides functions for analysing graphs.

class bibliometa.graph.analysis.GraphAnalysis(**kwargs)[source]

Bases: bibliometa.configuration.BibliometaConfiguration

GraphAnalysis configures and runs the analysis of a graph corpus.

It extends the abstract BibliometaConfiguration class.

get_config(*args)

Get the configuration. If no args are given, the full configuration is returned. Otherwise, only the configuration parameters named in args are returned.

Returns

The calling instance if no args are given, otherwise a Config object

Return type

bibliometa.configuration.BibliometaConfiguration or bibliometa.configuration.Config

set_config(**kwargs)

Set configuration for key-value pairs given in kwargs.

Returns

The calling instance

Return type

bibliometa.configuration.BibliometaConfiguration

start()[source]

Start the analysis.
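The get_config/set_config contract documented above (set_config returns the calling instance, get_config returns either the instance or a subset of parameters) can be illustrated with a small stand-in. This is only a sketch: ConfigurableSketch is a hypothetical class, SimpleNamespace stands in for bibliometa.configuration.Config, and the parameter name o is invented for the example (i and verbose are mentioned elsewhere in this reference).

```python
from types import SimpleNamespace


class ConfigurableSketch:
    """Hypothetical stand-in, not bibliometa's BibliometaConfiguration."""

    def __init__(self, **kwargs):
        # Assumption: parameters are stored as attributes of self.config.
        self.config = SimpleNamespace(**kwargs)

    def set_config(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self.config, key, value)
        return self  # returning the instance allows call chaining

    def get_config(self, *args):
        if not args:
            return self  # documented behaviour: the calling instance
        # Only the requested parameters, wrapped in a Config-like object.
        return SimpleNamespace(**{key: getattr(self.config, key) for key in args})


job = ConfigurableSketch(i="input.json").set_config(o="output.csv", verbose=True)
subset = job.get_config("i", "verbose")
print(subset.i, subset.verbose)
```

Because set_config returns the instance, configuration and a subsequent start() call could be written as one chained expression.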

bibliometa.graph.conversion module

This module provides a class for converting a JSON file to an edge list.

class bibliometa.graph.conversion.GraphCorpus[source]

Bases: object

GraphCorpus provides a static method that creates a graph corpus in JSON format. The graph corpus is needed when converting JSON to an edge list representation.

static create(data, config)[source]

Create a graph corpus.

Parameters
  • data (dict) – Dictionary containing data sets

  • config (bibliometa.configuration.Config) – Configuration object

Returns

Graph corpus

Return type

dict

Raises

FileNotFoundError – If the graph corpus cannot be written to file

class bibliometa.graph.conversion.JSON2EdgeList(**kwargs)[source]

Bases: bibliometa.configuration.BibliometaConfiguration

JSON2EdgeList configures and runs the conversion of an input JSON file to an edge list graph representation.

It extends the abstract BibliometaConfiguration class.

get_config(*args)

Get the configuration. If no args are given, the full configuration is returned. Otherwise, only the configuration parameters named in args are returned.

Returns

The calling instance if no args are given, otherwise a Config object

Return type

bibliometa.configuration.BibliometaConfiguration or bibliometa.configuration.Config

set_config(**kwargs)

Set configuration for key-value pairs given in kwargs.

Returns

The calling instance

Return type

bibliometa.configuration.BibliometaConfiguration

start(n=5)[source]

Start the conversion.

Parameters

n (int) – Number of elements shown in the data preview when verbose == True

Raises

FileNotFoundError – If the file given in self.config.i cannot be found.

bibliometa.graph.similarity module

This module provides a class for similarity function definitions and similarity calculations.

class bibliometa.graph.similarity.Similarity[source]

Bases: object

The Similarity class provides functions to define and calculate different types of similarity.

class Functions[source]

Bases: object

This class contains predefined similarity functions.

static jaccard(a, b, f, t=0)[source]

The Jaccard index. a and b are considered similar if the size of their intersection divided by the size of their union is greater than or equal to t.

Parameters
  • a (set) – Set of values for item a

  • b (set) – Set of values for item b

  • f (function, int or float) – This value (or the result of this function) will be returned if the similarity between a and b is >= t

  • t (int) – Threshold

Returns

Similarity value

Return type

float or int

Raises

ValueError – If f is neither a function nor an int or float
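The thresholding described above can be sketched in a few lines. This is an illustrative re-implementation, not the bibliometa source, and it rests on assumptions the reference does not confirm: a callable f is passed the raw score, 0 is returned when the score is below t, two empty sets score 0.0, and t may be fractional even though it is documented as int.

```python
def jaccard_sketch(a, b, f, t=0):
    """Illustrative re-implementation, not the bibliometa source."""
    if not callable(f) and not isinstance(f, (int, float)):
        raise ValueError("f must be a function, an int or a float")
    union = a | b
    # Assumption: two empty sets score 0.0 (avoids division by zero).
    score = len(a & b) / len(union) if union else 0.0
    if score < t:
        return 0  # Assumption: 0 means "not similar".
    # Assumption: a callable f receives the raw score.
    return f(score) if callable(f) else f


a = {"x", "y", "z"}
b = {"y", "z", "w"}
print(jaccard_sketch(a, b, 1, t=0.5))            # 2 shared of 4 total -> 1
print(jaccard_sketch(a, b, 1, t=0.6))            # below threshold -> 0
print(jaccard_sketch(a, b, lambda s: s, t=0.5))  # -> 0.5
```

Passing a constant for f yields an unweighted "similar or not" value, while passing a function lets the score itself act as a weight.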

static mint(a, b, f, t=0)[source]

a and b are considered similar if the size of their intersection is greater than or equal to t.

Parameters
  • a (set) – Set of values for item a

  • b (set) – Set of values for item b

  • f (function, int or float) – This value (or the result of this function) will be returned if the similarity between a and b is >= t

  • t (int) – Threshold

Returns

Similarity value

Return type

float or int

Raises

ValueError – If f is neither a function nor an int or float
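Unlike a ratio-based measure, this criterion compares an absolute count with t. A minimal sketch follows, under the same unconfirmed assumptions as any re-implementation of this API: a callable f is passed the intersection size, and 0 is returned when the count is below t.

```python
def mint_sketch(a, b, f, t=0):
    """Illustrative re-implementation, not the bibliometa source."""
    if not callable(f) and not isinstance(f, (int, float)):
        raise ValueError("f must be a function, an int or a float")
    shared = len(a & b)  # absolute number of common values
    if shared < t:
        return 0  # Assumption: 0 means "not similar".
    # Assumption: a callable f receives the intersection size.
    return f(shared) if callable(f) else f


a = {"Berlin", "Paris", "Rome"}
b = {"Paris", "Rome", "Vienna"}
print(mint_sketch(a, b, lambda n: n, t=2))  # two shared values -> 2
print(mint_sketch(a, b, lambda n: n, t=3))  # fewer than three -> 0
```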

static overlap(a, b, f, t=0)[source]

The overlap score. a and b are considered similar if the size of their intersection divided by the size of the smaller of the two sets is greater than or equal to t.

Parameters
  • a (set) – Set of values for item a

  • b (set) – Set of values for item b

  • f (function, int or float) – This value (or the result of this function) will be returned if the similarity between a and b is >= t

  • t (int) – Threshold

Returns

Similarity value

Return type

float or int

Raises

ValueError – If f is neither a function nor an int or float
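The practical difference between the overlap score and the Jaccard index shows up when one set is contained in the other. The sketch below computes only the raw scores (thresholding and f are left out). Both helper names are hypothetical, and treating empty sets as scoring 0.0 is an assumption.

```python
def overlap_score(a, b):
    """Intersection size divided by the size of the smaller set."""
    smaller = min(len(a), len(b))
    # Assumption: an empty set yields 0.0 (avoids division by zero).
    return len(a & b) / smaller if smaller else 0.0


def jaccard_score(a, b):
    """Intersection size divided by union size."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


small = {"x", "y"}
large = {"x", "y", "z", "w"}
print(overlap_score(small, large))  # subset of the larger set -> 1.0
print(jaccard_score(small, large))  # 2 shared of 4 total -> 0.5
```

The overlap score is therefore the more lenient choice when the compared sets differ strongly in size.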

static calculate(corpus, config)[source]

Calculate similarity between data sets.

Parameters
  • corpus (dict) – Graph corpus that contains the data on which similarity calculation will be based

  • config (bibliometa.configuration.Config) – Configuration object
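How a pairwise similarity calculation turns a corpus into an edge list can be sketched as follows. The corpus layout (a dict mapping each node to a set of values), the function name pairwise_edges and the (node, node, weight) output format are all assumptions made for illustration. The reference does not specify them.

```python
from itertools import combinations


def pairwise_edges(corpus, score, t=0):
    """Hypothetical helper: score every unordered node pair once."""
    edges = []
    for u, v in combinations(sorted(corpus), 2):
        weight = score(corpus[u], corpus[v])
        if weight >= t:  # keep only pairs that reach the threshold
            edges.append((u, v, weight))
    return edges


corpus = {
    "A": {"x", "y"},
    "B": {"y", "z"},
    "C": {"q"},
}
# Score by the number of shared values and require at least one.
edges = pairwise_edges(corpus, lambda a, b: len(a & b), t=1)
print(edges)  # only A and B share a value -> [('A', 'B', 1)]
```

Comparing every pair is quadratic in the number of nodes, which is worth keeping in mind for large corpora.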

bibliometa.graph.utils module

This module provides utility functions for graphs.

bibliometa.graph.utils.add_nodes_from_graph_corpus(graph, corpus, singletons=False, encoding='utf-8')[source]

Add nodes from a graph corpus file to a graph if they do not exist yet.

Parameters
  • graph (networkx.Graph) – Graph object

  • corpus (str) – Path to graph corpus file

  • singletons (bool) – If True, only nodes with no edges are returned

  • encoding (str) – File encoding

Returns

Updated Graph object

Return type

networkx.Graph

bibliometa.graph.utils.create_pos(graph, df, keys_labels, extent, verbose)[source]

Create dictionary with node positions.

Parameters
  • graph (networkx.Graph) – Graph object

  • df (pandas.DataFrame) – DataFrame with geographical information

  • keys_labels (str) – Column in DataFrame

  • extent (list) – Extent of map

  • verbose (bool) – Verbose parameter

Returns

Dictionary of positions

Return type

dict

bibliometa.graph.utils.get_graph_attributes(graph)[source]

Get degrees, labels and sizes for a graph.

Parameters

graph (networkx.Graph) – Graph object

Returns

Dictionary with degrees, labels, sizes

Return type

dict

bibliometa.graph.utils.get_nodes(graph, config, encoding='utf-8')[source]

Get nodes and their degrees from a graph.

Parameters
  • graph (networkx.Graph) – Graph object

  • config (bibliometa.configuration.Config) – Configuration object

  • encoding (str) – File encoding

Returns

Dictionary of nodes with their degrees

Return type

dict

bibliometa.graph.utils.get_subgraph(graph)[source]

Get largest connected component from graph.

Parameters

graph (networkx.Graph) – Graph object

Returns

Largest component

Return type

networkx.Graph
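What "largest connected component" means can be shown without any graph library. The sketch below runs a breadth-first search over a plain adjacency dict. It is not the bibliometa implementation, and the function name and input format are assumptions. With networkx, the same node set is commonly obtained via max(networkx.connected_components(graph), key=len).

```python
from collections import deque


def largest_component(adjacency):
    """Return the node set of the largest connected component."""
    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


adjacency = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B"},  # component of size 3
    "D": {"E"}, "E": {"D"},                   # component of size 2
    "F": set(),                               # isolated node
}
print(sorted(largest_component(adjacency)))  # ['A', 'B', 'C']
```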

bibliometa.graph.utils.load_graph(config, reload=False)[source]

Load a graph from a GraphML or similarity file.

Parameters
  • config (bibliometa.configuration.Config) – Configuration object

  • reload (bool) – Whether to load the graph directly from the similarity file

Returns

Graph

Return type

networkx.Graph

bibliometa.graph.utils.read_file(config, encoding='utf-8')[source]

Create graph by reading in a similarity file.

Parameters
  • config (bibliometa.configuration.Config) – Configuration object

  • encoding (str) – File encoding

Returns

Graph

Return type

networkx.Graph

bibliometa.graph.utils.save_file(path, results, encoding='utf-8')[source]

Save results in a text file.

Parameters
  • path (str) – Path to output file

  • results (list) – List containing results from graph analysis

  • encoding (str) – Encoding of file

bibliometa.graph.utils.update_graph(graph, df, col)[source]

Remove nodes from graph that do not appear in column col in DataFrame df.

Parameters
  • graph (networkx.Graph) – Graph object

  • df (pandas.DataFrame) – DataFrame

  • col (str) – Column in DataFrame

Returns

Updated Graph object

Return type

networkx.Graph

Module contents

Graph subpackage for Bibliometa.