PGM: Learning pgmpy
http://pgmpy.org/
GitHub:
https://github.com/pgmpy/pgmpy#installation
There are three ways to install it.
Using conda:
$ conda install -c ankurankan pgmpy
Using pip:
$ pip install -r requirements.txt # or requirements-dev.txt if you want to run unittests
$ pip install pgmpy
Or for installing the latest codebase:
$ git clone https://github.com/pgmpy/pgmpy
$ cd pgmpy/
$ pip install -r requirements.txt
$ python setup.py install
Reading the documentation
Example 01
# coding: utf-8
# In[16]:
# Start by defining the network structure.
# Note: later pgmpy releases rename BayesianModel to BayesianNetwork.
from pgmpy.models import BayesianModel
cancer_model = BayesianModel([('Pollution', 'Cancer'),
                              ('Smoker', 'Cancer'),
                              ('Cancer', 'Xray'),
                              ('Cancer', 'Dyspnoea')])
# In[17]:
# Now define the parameters: one CPD per node.
from pgmpy.factors.discrete import TabularCPD
cpd_poll = TabularCPD(variable='Pollution', variable_card=2,
                      values=[[0.9], [0.1]])
cpd_smoke = TabularCPD(variable='Smoker', variable_card=2,
                       values=[[0.3], [0.7]])
# Each column is one parent configuration; the first evidence variable
# (Smoker) varies slowest: (S=0, P=0), (S=0, P=1), (S=1, P=0), (S=1, P=1).
cpd_cancer = TabularCPD(variable='Cancer', variable_card=2,
                        values=[[0.03, 0.05, 0.001, 0.02],
                                [0.97, 0.95, 0.999, 0.98]],
                        evidence=['Smoker', 'Pollution'],
                        evidence_card=[2, 2])
cpd_xray = TabularCPD(variable='Xray', variable_card=2,
                      values=[[0.9, 0.2], [0.1, 0.8]],
                      evidence=['Cancer'], evidence_card=[2])
cpd_dysp = TabularCPD(variable='Dyspnoea', variable_card=2,
                      values=[[0.65, 0.3], [0.35, 0.7]],
                      evidence=['Cancer'], evidence_card=[2])
# In[18]:
# Associating the parameters with the model structure.
cancer_model.add_cpds(cpd_poll, cpd_smoke, cpd_cancer, cpd_xray, cpd_dysp)
# Checking if the cpds are valid for the model.
cancer_model.check_model()
# In[19]:
cancer_model.get_independencies()
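To make concrete what these CPDs encode, the marginal P(Cancer) can be worked out by hand. The following is a minimal sketch in plain Python, not pgmpy's inference API. It assumes the columns of cpd_cancer are ordered with the first evidence variable (Smoker) varying slowest, and the name marginal_cancer is illustrative only.

```python
# Hand computation of P(Cancer) from the CPD values above (no pgmpy needed).
# Assumption: the columns of the Cancer table follow the order
# (Smoker=0, Pollution=0), (0, 1), (1, 0), (1, 1).
from itertools import product

p_poll = [0.9, 0.1]     # P(Pollution)
p_smoke = [0.3, 0.7]    # P(Smoker)
p_cancer_given = [      # rows: Cancer state, columns: (Smoker, Pollution)
    [0.03, 0.05, 0.001, 0.02],
    [0.97, 0.95, 0.999, 0.98],
]


def marginal_cancer():
    """Sum P(Cancer | Smoker, Pollution) * P(Smoker) * P(Pollution) over the parents."""
    marginal = [0.0, 0.0]
    for smoker, pollution in product(range(2), repeat=2):
        column = smoker * 2 + pollution          # index of this parent configuration
        weight = p_smoke[smoker] * p_poll[pollution]
        for state in range(2):
            marginal[state] += weight * p_cancer_given[state][column]
    return marginal


p_cancer = marginal_cancer()
print(p_cancer)
```

With the numbers above this comes to roughly 0.0116 for Cancer state 0 and 0.9884 for state 1, which is a useful sanity check against whatever an inference query on the model reports.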
Example 02
# Bayesian Estimator
In [2]:
>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import BayesianEstimator
>>> data = pd.DataFrame(data={'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]})
>>> data.head()
Out[2]:
   A  B  C
0  0  0  1
1  0  1  1
2  1  0  0
In [3]:
>>> model = BayesianModel([('A', 'C'), ('B', 'C')])
>>> estimator = BayesianEstimator(model, data)
>>> cpd_C = estimator.estimate_cpd('C', prior_type="dirichlet", pseudo_counts=[1, 2])
>>> print(cpd_C)
╒══════╤══════╤══════╤══════╤════════════════════╕
│ A    │ A(0) │ A(0) │ A(1) │ A(1)               │
├──────┼──────┼──────┼──────┼────────────────────┤
│ B    │ B(0) │ B(1) │ B(0) │ B(1)               │
├──────┼──────┼──────┼──────┼────────────────────┤
│ C(0) │ 0.25 │ 0.25 │ 0.5  │ 0.3333333333333333 │
├──────┼──────┼──────┼──────┼────────────────────┤
│ C(1) │ 0.75 │ 0.75 │ 0.5  │ 0.6666666666666666 │
╘══════╧══════╧══════╧══════╧════════════════════╛
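The numbers in this table can be reproduced by simple counting. The sketch below is plain Python, not pgmpy code. It assumes that pseudo_counts=[1, 2] adds 1 to the C(0) count and 2 to the C(1) count in every parent configuration, which is consistent with the printed values. The function name dirichlet_cpd is hypothetical.

```python
# Reproduce the Dirichlet-smoothed estimate of P(C | A, B) by counting.
# Assumption: pseudo_counts=[1, 2] adds 1 to every C(0) count and 2 to
# every C(1) count, for each parent configuration.
from itertools import product

rows = [  # (A, B, C) tuples matching the three-row DataFrame
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
]
pseudo_counts = [1, 2]


def dirichlet_cpd(rows, pseudo_counts):
    """Return {(a, b): [P(C=0 | a, b), P(C=1 | a, b)]}."""
    cpd = {}
    for a, b in product(range(2), repeat=2):
        counts = list(pseudo_counts)             # start from the prior counts
        for row_a, row_b, row_c in rows:
            if (row_a, row_b) == (a, b):
                counts[row_c] += 1               # add the observed count
        total = sum(counts)
        cpd[(a, b)] = [count / total for count in counts]
    return cpd


cpd_c_by_hand = dirichlet_cpd(rows, pseudo_counts)
for parents, probs in sorted(cpd_c_by_hand.items()):
    print(parents, probs)
```

The configuration (A=1, B=1) never occurs in the data, so its column is just the normalized prior, 1/3 and 2/3.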
In [4]:
print(estimator.get_parameters(prior_type='BDeu', equivalent_sample_size=5))
[<TabularCPD representing P(…:2) at 0x7f86a42021d0>, <TabularCPD representing P(C:2 | A:2, B:2) at 0x7f86a4202940>, <TabularCPD representing P(…:2) at 0x7f86a42026d8>]
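The BDeu prior spreads the equivalent sample size evenly over the cells of each CPD. The sketch below works through that arithmetic for the three-row data set in plain Python. It assumes the usual BDeu definition, where each cell receives equivalent_sample_size / (node cardinality * number of parent configurations); the helper names bdeu_alpha and smoothed are illustrative and are not part of pgmpy.

```python
# BDeu pseudo counts, computed by hand for the three-row data set.
# Assumption: each CPD cell receives
#   equivalent_sample_size / (node cardinality * number of parent configurations)


def bdeu_alpha(equivalent_sample_size, node_card, parent_configs=1):
    """Pseudo count added to every cell of one node's CPD."""
    return equivalent_sample_size / (node_card * parent_configs)


def smoothed(counts, alpha):
    """Normalize observed counts after adding alpha to each state."""
    total = sum(counts) + alpha * len(counts)
    return [(count + alpha) / total for count in counts]


alpha_root = bdeu_alpha(5, node_card=2)                  # A and B have no parents
alpha_c = bdeu_alpha(5, node_card=2, parent_configs=4)   # C has parents A and B

p_a = smoothed([2, 1], alpha_root)      # A is 0 twice and 1 once in the data
p_c_a0_b0 = smoothed([0, 1], alpha_c)   # the single row with A=0, B=0 has C=1
print(alpha_root, alpha_c, p_a, p_c_a0_b0)
```

Because the per-cell pseudo count shrinks as the number of parent configurations grows, the prior pulls every CPD toward uniform with the same overall strength.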
Error log
When using Belief Propagation, calling query raised an error. The cause is a networkx version mismatch; see the project's GitHub repository. The versions pinned there are:
requirements.txt:
networkx==1.11
numpy==1.11.3
scipy==0.18.1
pandas==0.19.2
pyparsing==2.2
wrapt==1.10.8
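A quick way to spot this kind of mismatch is to compare what is installed against the pins. The sketch below uses only the standard library's importlib.metadata (Python 3.8+). The helper names parse_pins, installed_version, and find_mismatches are hypothetical, and the comparison is a plain string match, so "2.2" and "2.2.0" count as different.

```python
# Compare installed package versions with the pinned ones.
# parse_pins, installed_version and find_mismatches are illustrative names.
from importlib import metadata

PINS = """\
networkx==1.11
numpy==1.11.3
scipy==0.18.1
pandas==0.19.2
pyparsing==2.2
wrapt==1.10.8
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()     # drop trailing comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (pinned, installed)} for every package that differs."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in find_mismatches(parse_pins(PINS)).items():
    print(f"{name}: pinned {wanted}, installed {found}")
```

Passing a custom lookup function makes the check easy to try out without touching the real environment.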