Unit testing for Data Science in Python
pytest docs
Assert
- Assert: raises an AssertionError if the condition that follows is not true
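A minimal sketch of the idea (the function and test names here are made up for illustration): pytest collects functions named `test_*`, and a bare `assert` fails the test whenever its condition is false.

```python
def max_of(values):
    """Illustrative helper: return the largest element of a non-empty list."""
    largest = values[0]
    for v in values[1:]:
        if v > largest:
            largest = v
    return largest

def test_max_of():
    # If this condition is False, assert raises AssertionError and the test fails
    assert max_of([1, 5, 3]) == 5
```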
understanding test result report
- general information
- test result characters
  - F : failure (an exception, such as AssertionError, was raised)
  - . : passed
  - .F. : pass, fail, pass
- detailed information about the failed tests follows the character summary
test types
- data module
- feature module
- models module
- unit test
- unit : a small, independent piece of code, e.g. a Python function or class
Mastering assert statements
- the message is shown in the report only when the AssertionError is raised

```python
actual = [ sth ]
expected = None
message = ( sth )
assert actual is expected, message
```
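A concrete, runnable instance of the pattern above (the function and values are made up for illustration):

```python
def get_area(length, width):
    """Hypothetical function under test."""
    return length * width

def test_get_area():
    actual = get_area(2, 3)
    expected = 6
    message = f"get_area(2, 3) returned {actual} instead of {expected}"
    # The message appears in the report only if the assertion fails
    assert actual == expected, message
```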
Don't compare floats directly with ==
- floats are stored with limited binary precision, so use pytest.approx() instead
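For example, `0.1 + 0.1 + 0.1 == 0.3` is False because of binary floating-point representation, so the comparison should go through `pytest.approx()`:

```python
import pytest

def test_float_sum():
    # Direct equality fails: 0.1 + 0.1 + 0.1 == 0.30000000000000004
    assert 0.1 + 0.1 + 0.1 == pytest.approx(0.3)
    # approx() also works elementwise on sequences
    assert [0.1 + 0.1, 0.2 + 0.2] == pytest.approx([0.2, 0.4])
```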
testing for exceptions instead of return values

```python
with pytest.raises(ValueError):
    sth
```
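A runnable sketch (the strict converter here is made up for illustration): the `with` block passes only if the code inside it raises the named exception.

```python
import pytest

def strict_to_int(string):
    """Hypothetical converter that rejects anything but plain digits."""
    if not string.isdigit():
        raise ValueError(f"cannot convert {string!r}")
    return int(string)

def test_raises_on_bad_input():
    # Passes only if ValueError is raised inside the with block
    with pytest.raises(ValueError):
        strict_to_int("not a number")
```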
Test Driven Development (TDD)
```python
def convert_to_int(integer_string_with_commas):
    comma_separated_parts = integer_string_with_commas.split(",")
    for i in range(len(comma_separated_parts)):
        # Missing comma: no digit group may be longer than 3 characters
        if len(comma_separated_parts[i]) > 3:
            return None
        # Incorrectly placed comma: every group after the first must be exactly 3 digits
        if i != 0 and len(comma_separated_parts[i]) != 3:
            return None
    integer_string_without_commas = "".join(comma_separated_parts)
    try:
        return int(integer_string_without_commas)
    # A float-valued argument string like "6.9" makes int() raise ValueError
    except ValueError:
        return None
```
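In TDD these tests would be written before the implementation; a sketch of what they might look like (test names and example values are illustrative, with `convert_to_int` repeated so the sketch is self-contained):

```python
def convert_to_int(integer_string_with_commas):
    # Same implementation as above, copied here for self-containment
    comma_separated_parts = integer_string_with_commas.split(",")
    for i in range(len(comma_separated_parts)):
        if len(comma_separated_parts[i]) > 3:
            return None
        if i != 0 and len(comma_separated_parts[i]) != 3:
            return None
    try:
        return int("".join(comma_separated_parts))
    except ValueError:
        return None

def test_with_one_comma():
    assert convert_to_int("2,081") == 2081

def test_on_string_with_missing_comma():
    assert convert_to_int("178100,301") is None

def test_on_string_with_incorrectly_placed_comma():
    assert convert_to_int("12,72,891") is None

def test_on_float_valued_string():
    assert convert_to_int("6.9") is None
```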
How to organize a growing set of tests?
- run all the tests in a test class using its node ID:
  !pytest models/test_train.py::[name]
- run only the previously failing test by its node ID:
  !pytest models/[name].py::[f name]
Expected failures and conditional skipping
- conditional skipping

```python
class TestName:
    @pytest.mark.skipif([condition], reason="sth")
    def test_name(self, args):
        assert ...
```

- show only the reasons for skipped tests in the result report:
  !pytest -rs
- show only the reasons for expected failures (tests marked with @pytest.mark.xfail, reported as x):
  !pytest -rx
- show reasons for both skipped and xfailed tests: add both letters:
  !pytest -rsx
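A sketch of both marks (conditions and reasons are illustrative): skipif skips the test whenever the condition is true, while xfail marks a test that is expected to fail, so pytest reports it as x instead of F.

```python
import sys
import pytest

@pytest.mark.skipif(sys.version_info < (3, 0), reason="requires Python 3 true division")
def test_true_division():
    assert 1 / 2 == 0.5

@pytest.mark.xfail(reason="feature not implemented yet")
def test_unimplemented_feature():
    assert False  # pytest reports this as x (expected failure), not F
```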
Mocking
- testing functions independently of their dependencies
- pytest-mock : pytest plugin that provides the mocker fixture
- unittest.mock : standard-library mocking package (source of call used below)
Mocker

```python
from unittest.mock import call

# Add the mocker fixture as an argument to use mocking in this test
def test_on_raw_data(self, raw_and_clean_data_file, mocker):
    raw_path, clean_path = raw_and_clean_data_file
    # Replace the dependency with the bug-free mock
    convert_to_int_mock = mocker.patch(
        "data.preprocessing_helpers.convert_to_int",
        side_effect=convert_to_int_bug_free,
    )
    preprocess(raw_path, clean_path)
    # Check if preprocess() called the dependency correctly
    assert convert_to_int_mock.call_args_list == [
        call("1,801"), call("201,411"), call("2,002"), call("333,209"),
        call("1990"), call("782,911"), call("1,285"), call("389129"),
    ]
    with open(clean_path, "r") as f:
        lines = f.readlines()
    first_line = lines[0]
    assert first_line == "1801\t201411\n"
    second_line = lines[1]
    assert second_line == "2002\t333209\n"
```
Author And Source
More material on this topic (Unit testing for Data Science in Python) can be found at the original post: https://velog.io/@jee-9/Unit-testing-for-Data-Science-in-Python
Author attribution: the original author's information is available at the URL above, and copyright belongs to the original author. (Collection and Share based on the CC Protocol.)