b7990885d8b26b9404fd9ce952b0b2f005019594,california_housing/feature_engineering.py,,,#,23
Before Change
//make a stratified split of the data
split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
for train_index, test_index in split.split(housing, housing["income_cat"]):
    train_set = housing.loc[train_index]
    test_set = housing.loc[test_index]
for set_ in (train_set, test_set):
    set_.drop("income_cat", axis=1, inplace=True)
gc.collect()
//////////
// plot data
After Change
//example below
newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)
x = vincenty(newport_ri, cleveland_oh)
x //distance stored in km, see units on printing
print(x)
type(x.kilometers)
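The snippet above depends on geopy's `vincenty`, which newer geopy releases (2.0 and later) no longer provide; `geodesic` is its replacement there. As a dependency-free illustration of the same distance calculation, the sketch below uses the haversine great-circle formula instead. This is a swapped-in technique, not the method used in the commit: it treats the Earth as a sphere, so its result will differ slightly from an ellipsoidal calculation such as Vincenty's. The function name `haversine_km` and the radius constant are assumptions made for this example.

```python
import math

# Assumed constant for this sketch: mean Earth radius in kilometres.
EARTH_RADIUS_KM = 6371.009


def haversine_km(point_a, point_b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, point_a)
    lat2, lon2 = map(math.radians, point_b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine term: squared half-chord length between the two points.
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Same coordinates as the example above.
newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)

distance_km = haversine_km(newport_ri, cleveland_oh)
print(f"{distance_km:.1f} km")
```

The spherical approximation is usually adequate for a distance-based feature; where sub-kilometre accuracy matters, an ellipsoidal method is the better choice.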
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 6
Instances
Project Name: CNuge/kaggle-code
Commit Name: b7990885d8b26b9404fd9ce952b0b2f005019594
Time: 2018-01-12
Author: nugentc@uoguelph.ca
File Name: california_housing/feature_engineering.py
Class Name:
Method Name:
Project Name: developmentseed/label-maker
Commit Name: 4163c4eb8cffd43b05931100fa2dd807a3b19fbb
Time: 2018-01-26
Author: geospatialanalystyi@gmail.com
File Name: examples/utils/tf_records_generation.py
Class Name:
Method Name: main
Project Name: scikit-learn-contrib/imbalanced-learn
Commit Name: 9b31677971ef20cb033e787cdaac6f639a728e05
Time: 2019-11-17
Author: redoykhan555@gmail.com
File Name: imblearn/under_sampling/_prototype_selection/_instance_hardness_threshold.py
Class Name: InstanceHardnessThreshold
Method Name: _fit_resample