2.3. Pandas Read JSON

2.3.1. Rationale

  • The path argument can be a local file path or a URL
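
A minimal sketch of the point above, assuming a small made-up dataset (the records and the file name `sample.json` are hypothetical). To stay runnable offline it uses a `file://` URL in place of an `http(s)://` one; `pd.read_json()` treats both as URLs.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample records, made up for illustration
records = [
    {"sepalLength": 5.1, "species": "setosa"},
    {"sepalLength": 7.0, "species": "versicolor"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.json"
    path.write_text(json.dumps(records))

    # The same file, addressed first by path and then by URL
    from_path = pd.read_json(path)
    from_url = pd.read_json(path.as_uri())  # file:///.../sample.json

# Both calls should produce the same DataFrame
print(from_path.equals(from_url))
```

The same call shape applies to remote resources: pass the `https://...` address as the first argument.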

2.3.2. Compressed

  • If the file extension is .gz, .bz2, .zip, or .xz, the corresponding compression method is selected automatically (compression='infer', which is the default)

df = pd.read_json('sample_file.gz', compression='infer')
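
A small round-trip sketch of the behavior above, assuming made-up data and a hypothetical file name `sample.json.gz`. It writes a gzip-compressed JSON file, then reads it back once with the inferred compression and once with the method named explicitly.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data, made up for illustration
df = pd.DataFrame({
    "sepalLength": [5.1, 7.0],
    "species": ["setosa", "versicolor"],
})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.json.gz"

    # to_json() also infers the compression from the .gz extension
    df.to_json(path)

    # First two bytes of the file; gzip streams start with 0x1f 0x8b
    magic = path.read_bytes()[:2]

    # compression='infer' is the default, so it can be omitted
    restored = pd.read_json(path)

    # Naming the method explicitly gives the same result
    explicit = pd.read_json(path, compression="gzip")

print(magic)
print(restored)
```

Passing `compression` explicitly is mainly useful when the file name does not carry a recognizable extension.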

2.3.3. Assignments

Code 2.52. Solution
"""
* Assignment: Pandas Read JSON
* Complexity: easy
* Lines of code: 1 line
* Time: 3 min

English:
    1. Read data from `DATA` as `result: pd.DataFrame`
    2. Run doctests - all must succeed

Polish:
    1. Wczytaj dane z `DATA` jako `result: pd.DataFrame`
    2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> type(result) is pd.DataFrame
    True
    >>> len(result) > 0
    True
    >>> result.loc[[0,10,20]]
        sepalLength  sepalWidth  petalLength  petalWidth     species
    0           5.1         3.5          1.4         0.2      setosa
    10          7.0         3.2          4.7         1.4  versicolor
    20          6.3         3.3          6.0         2.5   virginica
"""

import pandas as pd

DATA = 'https://raw.githubusercontent.com/AstroMatt/book-python/master/_data/json/iris.json'


result = ...


Code 2.53. Solution
"""
* Assignment: Pandas Read JSON OpenAPI
* Complexity: medium
* Lines of code: 3 lines
* Time: 5 min

English:
    1. Import `requests` module
    2. Define `resp` as the result of `requests.get()` for `DATA`
    3. Define `data` by converting `resp` from JSON to a Python dict with `resp.json()`
    4. Define `result: pd.DataFrame` from the value under key `paths` in the `data` dict
    5. Run doctests - all must succeed

Polish:
    1. Zaimportuj moduł `requests`
    2. Zdefiniuj `resp` z rezultatem `requests.get()` dla `DATA`
    3. Zdefiniuj `data` z przekształceniem `resp` z JSON do Python dict wywołując `.json()` na `resp`
    4. Zdefiniuj `result: pd.DataFrame` dla wartości z klucza `paths` w słowniku `data`
    5. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `pd.DataFrame(data)`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> type(result) is pd.DataFrame
    True
    >>> len(result) > 0
    True
    >>> list(result.index)
    ['put', 'post', 'get', 'delete']
    >>> list(result.columns)  # doctest: +NORMALIZE_WHITESPACE
    ['/pet', '/pet/findByStatus', '/pet/findByTags', '/pet/{petId}', '/pet/{petId}/uploadImage',
     '/store/inventory', '/store/order', '/store/order/{orderId}',
     '/user', '/user/createWithList', '/user/login', '/user/logout', '/user/{username}']
"""

import pandas as pd
import requests

DATA = 'https://raw.githubusercontent.com/AstroMatt/book-python/master/_data/json/openapi.json'


resp = ...
data = ...
result = ...
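

The shape expected by the doctests above (HTTP methods in the index, endpoints in the columns) follows from how `pd.DataFrame()` treats a dict of dicts: outer keys become columns and inner keys become the index. A minimal sketch, assuming a heavily trimmed, hypothetical OpenAPI-style dict rather than the real file:

```python
import pandas as pd

# Hypothetical, trimmed-down OpenAPI-style document (not the real openapi.json)
data = {
    "openapi": "3.0.0",
    "paths": {
        "/pet": {
            "put": {"summary": "Update a pet"},
            "post": {"summary": "Add a pet"},
        },
        "/pet/{petId}": {
            "get": {"summary": "Find a pet"},
            "delete": {"summary": "Delete a pet"},
        },
    },
}

# Outer keys (endpoints) -> columns, inner keys (HTTP methods) -> index
paths = pd.DataFrame(data["paths"])

print(list(paths.columns))
print(sorted(paths.index))

# Cells hold the nested operation dicts; missing combinations are NaN
print(paths.notna())
```

Endpoints that do not define a given method end up as NaN in that row, so `paths.notna()` gives a quick overview of which methods each endpoint supports.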