Skip to content
This repository has been archived by the owner on Apr 30, 2020. It is now read-only.

Graph generators for testing #21

Closed
wants to merge 2 commits into from
Closed

Conversation

mzargham
Copy link
Collaborator

Introduce simple graph generators as research infrastructure [WIP]

Graph generators are critical for:

  • writing test cases for the python page_ranker implementation
  • for exploring the way any particular page_ranker parameterization spreads cred

Sample graphs included:

  • line graph
  • circle graph
  • star graph
  • tree graph

Joint work with @BrianLitwin

Test plan:
Tests added; require further review.

contains MultDiGraph generators consistent with networkx for seourcecred reference cases
line
star
circle
tree
Graph generators are critical for:
- writing test cases for the python page_ranker implementation
- for exploring the way any particular page_ranker parameterization spreads cred

Sample graphs included:
-line graph
-circle graph
-star graph
-tree graph

Joint work with @BrianLitwin

Test plan:
Tests added; require further review.
Copy link
Contributor

@teamdandelion teamdandelion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass review. Lots of bits of Python coding style to internalize. :)

"""
Created on Sun Mar 17 13:00:33 2019

@author: Zargham
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the function of the author block? Does it represent "ownership" of the file in an abstract sense? What information does it provide outside of the information readily available from git blame? Are users who make modifications to this file expected to add themselves as authors?

Similarly, why do we have a created timestamp? In the few cases where the creation date is relevant information, it can be found by checking the timestamp on the associated commit. I don't see why humans should be manually maintaining such metadata.

If we do standardize on having explicit authorship tags, it should be the users' github handle, and not their last name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this was autogenerated by spyder and I just sort of ignored it. If you have a specific conversion for headers, I'll happily follow it.

Copy link
Contributor

@teamdandelion teamdandelion Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just have a standard python docstring describing the purpose of the module.

"""
import networkx as nx

def lineGraphGen(N, bidir=False,nodeTypeName='vanilla',edgeTypeName='vanilla'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, python style for method names is underscore naming, as in line_graph_gen

"""
import networkx as nx

def lineGraphGen(N, bidir=False,nodeTypeName='vanilla',edgeTypeName='vanilla'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing commas between spaces. run black on this code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ran black

"""
import networkx as nx

def lineGraphGen(N, bidir=False,nodeTypeName='vanilla',edgeTypeName='vanilla'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, arguments should have underscore naming, as in edge_type_name

"""
import networkx as nx

def lineGraphGen(N, bidir=False,nodeTypeName='vanilla',edgeTypeName='vanilla'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is edge_type_name and node_type_name needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The the python versions of the pagerank algorithm use the types to look up the edge weights. In these simple generators there is only one type of edge but i was expecting to use them for constructing more sophisticated graphs by combining these, the resulting graphs would would have multiple types.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i updated these methods so the user can assign them with a dictionary, thus making heterogenous types, like in the sourcecred graphs with real data.


class GraphGenerators_StarGraph(unittest.TestCase):

# how to validate that 'kind' is either 'source', 'sink', or 'bidir' ??
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should have the fn raise a ValueError if the kind isnt an acceptable value, then test that behavior here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added error handling (first try)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added error handling across the module.


def test_starGraph_sink(self):
g = starGraphGen(4)
self.assertEqual(list(g.edges()), [(1, 0), (2, 0), (3, 0), (4, 0)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BrianLitwin when we were chatting i mentioned i didn't like these test cases, but now that i am reading and paying closer attention, i think they're fine. its pretty easy to read and reason about what this graph is, and whether it's the right graph for the arguments.


def test_starGraph_bidir(self):
g = starGraphGen(4, kind='bidir')
self.assertEqual(list(g.edges()), [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0), (3, 0), (4, 0)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is kind of a doozy to keep all in my head--maybe check for n=2 or n=1 instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept them at mostly 3 and 4 because for 2 node graphs the there would be no way to distinguish some of the graph patterns; for example the star and the line would be the same.

"""
import networkx as nx

def lineGraphGen(N, bidir=False,nodeTypeName='vanilla',edgeTypeName='vanilla'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, all of these public methods should have python docstrings:
https://www.python.org/dev/peps/pep-0257/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added doc string; not sure they are totally up to convention but i read the link and added simple statements about each function.

])

# this is the way I might test that the types assigned in Z's functions are correct
# I think it would be a good idea to abstract out that logic into a helper function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, agreed! see my comment above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

factored it out. Thanks.

@mzargham
Copy link
Collaborator Author

FYI @decentralion ~ i was out of sync with master and struggling to get lined up so to avoid issues we ran into with the other rebase, I opted to make a new branch. Working on these updates here:
https://github.com/sourcecred/research/tree/graph-generators-take-2/infra

@teamdandelion
Copy link
Contributor

@mzargham cool, send a pull request my way once you want another review.

@teamdandelion
Copy link
Contributor

Closing as obsoleted by #26

@wchargin wchargin deleted the graph-generators-for-testing branch September 11, 2019 03:08
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants