Geocode Documentation
Ilkka-LBL committed Oct 30, 2024
1 parent e0a17db commit c517ca5
Showing 3 changed files with 35 additions and 39 deletions.
11 changes: 1 addition & 10 deletions Consensus/EsriConnector.py
@@ -178,16 +178,7 @@ async def service_metadata(self, session: aiohttp.ClientSession) -> Dict[str, An
Returns:
Dict[str, Any]: The metadata as a JSON object.
"""

"""if not self.layers and not self.tables:
service_url = f"{self.url}/0?f=json"
elif self.layers and not self.tables:
service_url = f"{self.url}/{self.layers[0]['id']}?f=json"
elif self.tables and not self.layers:
service_url = f"{self.url}/{self.tables[0]['id']}?f=json"
else:
service_url = f"{self.url}/0?f=json"
print(service_url)"""

if self.layers:
metadata_urls = [f"{self.url}/{layer['id']}?f=json" for layer in self.layers]
elif self.tables:
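The retained branch builds one metadata URL per layer, and otherwise per table. A minimal sketch of that selection as a standalone function follows. `build_metadata_urls` is a hypothetical helper name, not part of the library, and two things are assumptions because the hunk is truncated: that the tables branch mirrors the layers branch, and that a service with neither layers nor tables falls back to sub-layer 0 as the removed block did.

```python
from typing import Any, Dict, List


def build_metadata_urls(url: str, layers: List[Dict[str, Any]], tables: List[Dict[str, Any]]) -> List[str]:
    """Hypothetical helper: one metadata URL per layer, else one per table."""
    if layers:
        return [f"{url}/{layer['id']}?f=json" for layer in layers]
    if tables:
        # Assumption: tables are handled the same way as layers.
        return [f"{url}/{table['id']}?f=json" for table in tables]
    # Assumption: fall back to sub-layer 0; the truncated hunk does not show the real else branch.
    return [f"{url}/0?f=json"]
```

Keeping the URL construction separate from the `aiohttp` request would make this part testable without network access.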
54 changes: 29 additions & 25 deletions Consensus/GeocodeMerger.py
@@ -9,44 +9,48 @@
The end result is not by any means perfect and you are advised to try different paths and to check that the output makes sense.
Usage:
------
This class works as follows.
This class works as follows.
Internally, on initialising the class with `await SmartLinker().initialise()`, a JSON lookup file of the available tables in Open Geography Portal is read if it exists, or created if it does not.
Then, using the information contained in the JSON file, a graph of connections between table columns is created using the `run_graph()` method. At this point the user provides the names of the starting and ending columns,
an optional list of `geographic_areas` and an optional list of `geographic_area_columns` in which those areas are looked up to create a subset of the data.
Internally, on initialising the class with `await SmartLinker().initialise()`, a JSON lookup file of the available tables in Open Geography Portal is read if it exists, or created if it does not.
Then, using the information contained in the JSON file, a graph of connections between table columns is created using the `run_graph()` method. At this point the user provides the names of the starting and ending columns,
an optional list of `geographic_areas` and an optional list of `geographic_area_columns` in which those areas are looked up to create a subset of the data.
Following the creation of the graph, all possible starting points are searched for (i.e., which tables contain the user-provided starting column). After this, we look for the shortest paths to the ending column.
To do this, we look for all possible paths from every table containing the starting column to every table containing the ending column and count the number of steps (tables) in each path.
The `run_graph()` method prints out a numbered list of possible paths.
Following the creation of the graph, all possible starting points are searched for (i.e., which tables contain the user-provided starting column). After this, we look for the shortest paths to the ending column.
To do this, we look for all possible paths from every table containing the starting column to every table containing the ending column and count the number of steps (tables) in each path.
The `run_graph()` method prints out a numbered list of possible paths.
The user can get their chosen data using the `geodata()` method by providing an integer matching their chosen path to the `selected_path` argument.
The user can get their chosen data using the `geodata()` method by providing an integer matching their chosen path to the `selected_path` argument.
The intended workflow is:
=========================
Intended workflow
^^^^^^^^^^^^^^^^^
First explore the possible geographies.
.. code-block:: python
First explore the possible geographies.
from Consensus import SmartLinker, GeoHelper
.. code-block:: python
gh = GeoHelper()
print(gh.geography_keys())
from Consensus import SmartLinker, GeoHelper
print(gh.available_geographies())
gh = GeoHelper()
print(gh.geography_keys())
print(gh.geographies_filter('WD')) # outputs all columns referring to wards.
print(gh.available_geographies())
print(gh.geographies_filter('WD')) # outputs all columns referring to wards.
Once you've decided you want to look at 2022 wards, you can do the following:
.. code-block:: python
Once you've decided you want to look at 2022 wards, you can do the following:
.. code-block:: python
gss = SmartLinker()
gss.allow_geometry('geometry_only') # use this method to restrict the graph search space to tables with geometry
gss = SmartLinker()
gss.allow_geometry('geometry_only') # use this method to restrict the graph search space to tables with geometry
await gss.initialise()
gss.run_graph(starting_column='WD22CD', ending_column='LAD22CD', geographic_areas=['Lewisham', 'Southwark'], geographic_area_columns=['LAD22NM']) # the starting and ending columns should end in CD
codes = await gss.geodata(selected_path=9, chunk_size=50) # the selected path is the ninth in the list of potential paths output by `run_graph()` method. Increase chunk_size if your download is slow and try decreasing it if you are being throttled (or encounter weird errors).
print(codes['table_data'][0]) # the output is a dictionary of {'path': [[table1_of_path1, table2_of_path1], [table1_of_path2, table2_of_path2]], 'table_data': [data_for_path1, data_for_path2]}
await gss.initialise()
gss.run_graph(starting_column='WD22CD', ending_column='LAD22CD', geographic_areas=['Lewisham', 'Southwark'], geographic_area_columns=['LAD22NM']) # the starting and ending columns should end in CD
codes = await gss.geodata(selected_path=9, chunk_size=50) # the selected path is the ninth in the list of potential paths output by `run_graph()` method. Increase chunk_size if your download is slow and try decreasing it if you are being throttled (or encounter weird errors).
print(codes['table_data'][0]) # the output is a dictionary of {'path': [[table1_of_path1, table2_of_path1], [table1_of_path2, table2_of_path2]], 'table_data': [data_for_path1, data_for_path2]}
"""
import pandas as pd
import json
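The path search described in the docstring can be illustrated with a small breadth-first search. This is a sketch under stated assumptions, not SmartLinker's implementation: the `tables` mapping, the table names and the `shortest_paths` function are all hypothetical, and two tables are treated as connected when they share at least one column.

```python
from collections import deque
from typing import Dict, List, Set


def shortest_paths(tables: Dict[str, Set[str]], starting_column: str, ending_column: str) -> List[List[str]]:
    """Return one shortest table-to-table path per table that holds starting_column."""
    paths = []
    # Every table containing the starting column is a possible starting point.
    starts = sorted(name for name, cols in tables.items() if starting_column in cols)
    for start in starts:
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            current = path[-1]
            if ending_column in tables[current]:
                paths.append(path)  # BFS reaches the nearest matching table first
                break
            for name in sorted(tables):
                # Hypothetical edge rule: tables are linked if they share a column.
                if name not in seen and tables[current] & tables[name]:
                    seen.add(name)
                    queue.append(path + [name])
    return sorted(paths, key=len)


# Made-up table names and columns, for illustration only.
tables = {
    "WD22_LAD22_lookup": {"WD22CD", "WD22NM", "LAD22CD"},
    "WD22_boundaries": {"WD22CD", "WD22NM"},
    "LAD22_boundaries": {"LAD22CD", "LAD22NM"},
}
for number, path in enumerate(shortest_paths(tables, "WD22CD", "LAD22CD")):
    print(number, path)
```

The numbered output mirrors the idea of choosing a path by integer, as with the `selected_path` argument of `geodata()`.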
9 changes: 5 additions & 4 deletions tests/test_ConfigManager.py
@@ -10,10 +10,10 @@

class TestConfigManager(unittest.TestCase):
    def setUp(self) -> None:
        self.conf_dict = {"nomis_api_key":"xxx",
                          "proxies.http": "proxy",
                          "proxies.https": "proxy"}
        self.conf_dict = {"nomis_api_key": "xxx",
                          "proxies.http": "proxy",
                          "proxies.https": "proxy"}

        self.conf = ConfigManager()
        self.conf.default_config = self.conf_dict

@@ -31,5 +31,6 @@ def test_update(self) -> None:
        self.assertEqual(updated_config['nomis_api_key'], "xxx")
        self.assertNotEqual(loaded_config, updated_config)


if __name__ == '__main__':
    unittest.main()
