EADST

Train XGBoost Model with Pandas Input


import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn.metrics import classification_report

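# Load the training and test sets
train = pd.read_csv('./train.csv')
test = pd.read_csv('./test.csv')


# Load the auxiliary table and inspect it
info = pd.read_csv('info.csv')
print(info.head())  # first rows and column names
print(info.shape)   # (rows, columns)
# Keep one row per id so the left join cannot duplicate rows
new_info = info.drop_duplicates(subset=['id'])
# Left-join the 'number' column by id; fillna(0) replaces every NaN in the merged frame,
# including 'number' for ids that are missing from info
train2 = pd.merge(train, new_info[['id', 'number']], how='left', on='id').fillna(0)
# Apply the same join to the test set so that it has the same feature columns as the training set
test2 = pd.merge(test, new_info[['id', 'number']], how='left', on='id').fillna(0)

# Separate the label from the features
train_y = train2['result']
train_x = train2.drop(columns=['uaid', 'result', 'others'])
test_id = test2['id']
test_y = test2['result']
test_x = test2.drop(columns=['uaid', 'result', 'others'])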
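
The deduplication before the merge matters because a left join repeats a row once for every matching key on the right. The following is a minimal sketch of that effect on made-up toy tables; only the column names `id`, `number`, and `result` come from the script above, the values are hypothetical. The `validate="many_to_one"` argument is an optional safety check, not something the original script uses.

```python
import pandas as pd

# Hypothetical toy tables; values are made up for illustration.
train = pd.DataFrame({"id": [1, 2, 3], "result": [0, 1, 0]})
# id 1 appears twice in info, id 3 does not appear at all.
info = pd.DataFrame({"id": [1, 1, 2], "number": [10, 11, 20]})

# Merging without deduplication repeats the training row for id 1.
naive = pd.merge(train, info[["id", "number"]], how="left", on="id")

# Deduplicate first (keeps the first row per id), then left-join and fill the gaps.
new_info = info.drop_duplicates(subset=["id"])
merged = pd.merge(
    train,
    new_info[["id", "number"]],
    how="left",
    on="id",
    validate="many_to_one",  # raises MergeError if ids on the right are not unique
).fillna(0)

print(len(naive), len(merged))
print(merged)
```

Whatever join is applied to the training table should also be applied to the test table, since the model expects the same feature columns at prediction time.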


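# Train a classifier with the default hyperparameters
model = xgb.XGBClassifier()
model.fit(train_x, train_y)
# Report precision, recall and F1 on the training data itself;
# these scores are optimistic and do not measure generalization
train_predict_y = model.predict(train_x)
print(classification_report(train_y, train_predict_y))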


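# Predict class probabilities for the test set (one column per class)
result = model.predict_proba(test_x)
# Put the ids and true labels next to the predicted probabilities, then save
result = pd.concat([test_id, test_y, pd.DataFrame(result)], axis=1)
result.to_csv('./test_result.csv')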
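
Note that `pd.concat(..., axis=1)` aligns rows by index. This works in the script because `read_csv` gives the test table a default 0..n-1 index, the same index a fresh `pd.DataFrame(result)` gets. If the test table had been filtered or re-indexed, the rows would no longer line up. The sketch below uses made-up labels and probabilities to show the problem and one possible fix; the `proba_` column names are a hypothetical naming choice.

```python
import numpy as np
import pandas as pd

# Made-up stand-ins: labels with a non-default index and a predict_proba-style array.
test_y = pd.Series([0, 1, 1], index=[10, 11, 12], name="result")
proba = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])
classes = [0, 1]  # with a fitted model, this would be model.classes_

# Indexes differ (10..12 vs 0..2), so concat produces extra rows filled with NaN.
misaligned = pd.concat([test_y, pd.DataFrame(proba)], axis=1)

# Reuse the label index and give each probability column a readable name.
proba_df = pd.DataFrame(
    proba, columns=[f"proba_{c}" for c in classes], index=test_y.index
)
result = pd.concat([test_y, proba_df], axis=1)

# index=False leaves the row index out of the CSV; pass a path to write a file instead.
csv_text = result.to_csv(index=False)
print(len(misaligned), result.shape)
print(csv_text)
```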