Parameters
Here is the full list of configuration parameters you can specify in a config.json
file.
- country: string
- The OSM QA Tile extract to download. The string should be a country matching one of the options in label_maker/countries.txt
- bounding_box: list of floats
- The bounding box to create images from. This should be given in the form [xmin, ymin, xmax, ymax] as longitude and latitude values between [-180, 180] and [-90, 90], respectively. Values should use the WGS84 datum, with longitude and latitude units in decimal degrees.
- geojson: string
- An input file containing a GeoJSON FeatureCollection representing labels. Adding this parameter will override the values in the country and bounding_box parameters.
- zoom: int
- The zoom level used to create images. This functions as a rough proxy for resolution. Value should be given as an int on the interval [0, 19].
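To make the link between zoom and resolution concrete, the sketch below applies the standard Web Mercator tiling formulas to estimate the ground size of a pixel and the number of tiles covering a bounding box. It is not taken from Label Maker itself: the function names are hypothetical and the bounding box is an illustrative value.

```python
import math

EARTH_CIRCUMFERENCE_M = 2 * math.pi * 6378137.0  # WGS84 equatorial radius
TILE_SIZE = 256  # pixels per tile side


def meters_per_pixel(zoom, latitude=0.0):
    """Approximate ground resolution of a 256 px Web Mercator tile."""
    scale = math.cos(math.radians(latitude))
    return EARTH_CIRCUMFERENCE_M * scale / (TILE_SIZE * 2 ** zoom)


def lonlat_to_tile(lon, lat, zoom):
    """Standard slippy-map formula: column and row of the tile holding a point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def count_tiles(bounding_box, zoom):
    """Number of tiles touched by [xmin, ymin, xmax, ymax] at a zoom level."""
    xmin, ymin, xmax, ymax = bounding_box
    left, top = lonlat_to_tile(xmin, ymax, zoom)
    right, bottom = lonlat_to_tile(xmax, ymin, zoom)
    return (right - left + 1) * (bottom - top + 1)


example_box = [1.09725, 6.05520, 1.34582, 6.30915]  # illustrative values
for zoom in (12, 15, 17):
    print(zoom, round(meters_per_pixel(zoom, latitude=6.2), 2), count_tiles(example_box, zoom))
```

Each extra zoom level halves the pixel size and roughly quadruples the number of tiles, which is worth checking before downloading imagery that may carry a cost.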
- classes: list of dicts
The training classes. Each class is defined as a dict object with two required keys:
- name: string
- The class name.
- filter: list of strings
- A Mapbox GL Filter to define any vector features matching this class. Filters are applied with the standalone featureFilter from Mapbox GL JS.
- buffer: int
- Optional parameter to buffer labels in 'object-detection' and 'segmentation' tasks by an arbitrary number of pixels. Accepts both positive and negative integers. It uses Shapely object.buffer to calculate the final geometry. You can verify that your buffer options create the desired labels by inspecting the files created in data/labels/ after running the label-maker labels command.
- imagery: string
Label Maker expects to receive imagery tiles that are 256 x 256 pixels. You can specify the source of the imagery with one of:
- A template string for a tiled imagery service, for example 'https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token={ACCESS_TOKEN}'. Note that you will generally need an API key to obtain images and there may be associated costs. This example requires a Mapbox access token. Also see OpenAerialMap for open imagery. The access token for TMS image formats can be read from an environment variable or added directly to the imagery string.
- A GeoTIFF file location. Works with local files, and the location can also be a URL, for example 'http://oin-hotosm.s3.amazonaws.com/593ede5ee407d70011386139/0/3041615b-2bdb-40c5-b834-36f580baca29.tif'
- A remote source such as a WMS endpoint GetMap request. Fill out all necessary parameters except bbox, which should be set as {bbox}. Ex: 'https://basemap.nationalmap.gov/arcgis/services/USGSImageryOnly/MapServer/WMSServer?SERVICE=WMS&REQUEST=GetMap&VERSION=1.1.1&LAYERS=0&STYLES=&FORMAT=image%2Fjpeg&TRANSPARENT=false&HEIGHT=256&WIDTH=256&SRS=EPSG%3A3857&BBOX={bbox}'
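The placeholders in these imagery strings can be pictured as ordinary template fields. The following is a minimal sketch of how a tile request URL might be assembled, not Label Maker's own code: the helper names are hypothetical, the environment variable name ACCESS_TOKEN is an assumption that mirrors the placeholder, and the WMS bounding box is computed in EPSG:3857 because that is the SRS used in the example request.

```python
import os

ORIGIN_SHIFT = 20037508.342789244  # half the Web Mercator world width, in metres


def tms_url(template, x, y, z, token=None):
    """Fill a tiled-imagery template; the token falls back to the environment."""
    if token is None:
        # Assumption: the variable is named ACCESS_TOKEN, like the placeholder.
        token = os.environ.get("ACCESS_TOKEN", "")
    # Unused keyword arguments are ignored, so a template with the token
    # written directly into the string works as well.
    return template.format(x=x, y=y, z=z, ACCESS_TOKEN=token)


def tile_bbox_3857(x, y, z):
    """Bounds (xmin, ymin, xmax, ymax) of an XYZ tile in EPSG:3857 metres."""
    size = 2 * ORIGIN_SHIFT / 2 ** z
    xmin = -ORIGIN_SHIFT + x * size
    ymax = ORIGIN_SHIFT - y * size
    return (xmin, ymax - size, xmin + size, ymax)


def wms_url(template, x, y, z):
    """Substitute the tile bounds for the {bbox} placeholder of a GetMap URL."""
    bbox = ",".join(f"{value:.6f}" for value in tile_bbox_3857(x, y, z))
    return template.format(bbox=bbox)


tms = "https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token={ACCESS_TOKEN}"
print(tms_url(tms, x=2066, y=1978, z=12, token="my-token"))
```

Because only the placeholders are substituted, every other GetMap parameter (layers, format, height, width) has to be present in the string already, as the WMS example above shows.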
- http_auth: list
- Optional parameter to specify a username and password for restricted WMS services. For example, ['my_username', 'my_password'].
- background_ratio: float
- Specify how many background (or “negative”) training examples to create. Label Maker will generate a number of background images equal to background_ratio times the total number of class tiles.
- ml_type: string
One of 'classification', 'object-detection', or 'segmentation'. This defines the output format for the final label numpy arrays (y_train and y_test).
'classification'
- Output is an array of length len(classes) + 1. Each array value will be either 1 or 0 based on whether it matches the class at the same index. The additional array element belongs to the background class, which will always be the first element.
'object-detection'
- Output is an array of bounding boxes of the form [xmin, ymin, width, height, class_index]. In this case, the values are pixel values measured from the upper left-hand corner (not latitude and longitude values). Each feature is tested against each class, so if a feature matches two or more classes, it will have the corresponding number of bounding boxes created.
'segmentation'
- Output is an array of shape (256, 256) with values matching the class index label at that position. The classes are applied sequentially according to config.json, so later classes will be written over earlier class labels if there is overlap.
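The three label formats described above can be illustrated with plain Python lists. This is a sketch of the shapes only, not Label Maker's implementation: the helper names are hypothetical, the grid is 4 x 4 rather than 256 x 256 to keep the output readable, and two details are assumptions, namely that the background flag is 1 only when no class matches and that class indices start at 1 with 0 reserved for background.

```python
def classification_label(matched, classes):
    """One 0/1 flag per class, with the background flag in first position."""
    flags = [1 if name in matched else 0 for name in classes]
    background = 0 if any(flags) else 1  # assumption: set only when nothing matches
    return [background] + flags


def detection_labels(features, classes):
    """features: (xmin, ymin, width, height, matched class names) in pixels."""
    rows = []
    for xmin, ymin, width, height, matched in features:
        # A feature matching several classes yields one box per class.
        for index, name in enumerate(classes, start=1):
            if name in matched:
                rows.append([xmin, ymin, width, height, index])
    return rows


def segmentation_label(masks, classes, size=4):
    """Paint class indices onto a size x size grid in config order."""
    grid = [[0] * size for _ in range(size)]  # 0 = background
    for index, name in enumerate(classes, start=1):
        for row, col in masks.get(name, ()):
            grid[row][col] = index  # later classes overwrite earlier ones
    return grid


classes = ["Roads", "Buildings"]
print(classification_label({"Buildings"}, classes))
print(detection_labels([(10, 20, 30, 40, {"Roads", "Buildings"})], classes))
print(segmentation_label({"Roads": [(0, 0), (0, 1)], "Buildings": [(0, 1)]}, classes))
```

The overlap at position (0, 1) ends up labelled as the second class, which is why the order of entries in classes matters for segmentation.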
- seed: int
- Random generator seed. Optional, use to make results reproducible.
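The effect of a seed is easiest to see together with the split_vals and split_names parameters described next. The sketch below shuffles a list of tile names with a seeded generator and cuts it by the given fractions. It is an illustration under stated assumptions: split_tiles is a hypothetical helper, and Label Maker's own shuffling and rounding may differ.

```python
import random


def split_tiles(tiles, split_names, split_vals, seed=None):
    """Shuffle tiles reproducibly and divide them into named subsets."""
    if len(split_names) != len(split_vals):
        raise ValueError("split_names and split_vals must have the same length")
    if abs(sum(split_vals) - 1.0) > 1e-9:
        raise ValueError("split_vals must sum to one")
    shuffled = list(tiles)
    random.Random(seed).shuffle(shuffled)  # same seed, same order
    subsets, start, cumulative = {}, 0, 0.0
    for position, (name, fraction) in enumerate(zip(split_names, split_vals)):
        cumulative += fraction
        is_last = position == len(split_names) - 1
        # The last subset takes whatever remains, so no tile is dropped.
        end = len(shuffled) if is_last else round(cumulative * len(shuffled))
        subsets[name] = shuffled[start:end]
        start = end
    return subsets


tiles = [f"tile-{i}" for i in range(10)]
subsets = split_tiles(tiles, ["train", "validate", "test"], [0.7, 0.2, 0.1], seed=42)
print({name: len(members) for name, members in subsets.items()})
```

Running the function twice with the same seed returns identical subsets, while omitting the seed gives a different shuffle on each run.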
- split_vals: list
Default: [0.8, 0.2]
Fraction of data to put in each category listed in split_names. Must be a list of floats that sum to one and match the length of split_names. For train, validate, and test data, a list like [0.7, 0.2, 0.1] is suggested.
- split_names: list
Default: ['train', 'test']
List of names for each subset of the data. Length of list must match length of split_vals.
- imagery_offset: list of ints
- An optional list of integers representing the number of pixels to offset imagery. For example, [15, -5] will move the images 15 pixels right and 5 pixels up relative to the requested tile bounds.
- tms_image_format: string
- An optional string giving the downloaded imagery’s format, such as .jpg or .png, when it isn’t provided by the endpoint.
- over_zoom: int
- An integer greater than 0. If set for XYZ tiles, it will fetch tiles from zoom + over_zoom to create higher resolution tiles which fill out the bounds of the original zoom level.
- band_indices: list
Default: [1, 2, 3]
A list of band indices to pull from a TIF. Using the SpaceNet Roads Challenge Data as an example, you can use [5, 3, 2, 7] to extract the Red, Green, Blue, and NIR bands respectively.
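Pulling the parameters above together, a config.json might be built and sanity-checked as in the following sketch. The values are illustrative only (the country name is assumed to appear in label_maker/countries.txt), and validate_config is a hypothetical helper that checks a few of the constraints stated in this list. It is not part of Label Maker.

```python
import json

ML_TYPES = {"classification", "object-detection", "segmentation"}

# Illustrative values only; the country and coordinates are assumptions.
config = {
    "country": "togo",
    "bounding_box": [1.09725, 6.05520, 1.34582, 6.30915],
    "zoom": 12,
    "classes": [
        {"name": "Roads", "filter": ["has", "highway"], "buffer": 3},
        {"name": "Buildings", "filter": ["has", "building"]},
    ],
    "imagery": "https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token={ACCESS_TOKEN}",
    "background_ratio": 1,
    "ml_type": "segmentation",
    "seed": 19,
    "split_names": ["train", "validate", "test"],
    "split_vals": [0.7, 0.2, 0.1],
}


def validate_config(cfg):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if "geojson" not in cfg:  # geojson overrides country and bounding_box
        box = cfg.get("bounding_box", [])
        if len(box) != 4:
            problems.append("bounding_box needs [xmin, ymin, xmax, ymax]")
        else:
            xmin, ymin, xmax, ymax = box
            if not all(-180 <= v <= 180 for v in (xmin, xmax)):
                problems.append("longitudes must lie in [-180, 180]")
            if not all(-90 <= v <= 90 for v in (ymin, ymax)):
                problems.append("latitudes must lie in [-90, 90]")
    zoom = cfg.get("zoom")
    if not isinstance(zoom, int) or not 0 <= zoom <= 19:
        problems.append("zoom must be an int in [0, 19]")
    if cfg.get("ml_type") not in ML_TYPES:
        problems.append("ml_type must be one of " + ", ".join(sorted(ML_TYPES)))
    for cls in cfg.get("classes", []):
        if "name" not in cls or "filter" not in cls:
            problems.append("each class needs 'name' and 'filter'")
        if not isinstance(cls.get("buffer", 0), int):
            problems.append("buffer must be an integer")
    names = cfg.get("split_names", ["train", "test"])
    vals = cfg.get("split_vals", [0.8, 0.2])
    if len(names) != len(vals) or abs(sum(vals) - 1.0) > 1e-9:
        problems.append("split_vals must sum to one and match split_names")
    return problems


print(json.dumps(config, indent=2))  # the text to save as config.json
print(validate_config(config))
```

The filters use the Mapbox GL "has" operator, so any feature carrying a highway or building tag falls into the corresponding class. The buffer key is included on one class only to show that it is optional.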