WeRateDogs @ Twitter Data Exploration

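This project comes from the second project of the Udacity Data Analyst Advanced course. The goal is to gather, assess, and clean data from different sources (existing data, HTTP data, and API JSON data), and to use visualization to uncover valuable information in the data. The collected data includes basic information about the tweets, as well as neural-network predictions of the content of the tweet images (which dog breed appears in each image). Some of the visualizations turned out to be interesting, so I am sharing them here. Change in the number of WeRateDogs tweets: The chart shows that the account owner started running this Twitter account in November 2015. Tweet volume peaked in the second month, averaging more than 12 original tweets per day (the start-up phase was hard work indeed). It then gradually declined, leveled off from the second quarter of 2016 at about 2 to 3 tweets per day, and has stayed at that level since. W...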
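The per-month daily averages described above can be computed along the following lines. This is a minimal sketch rather than the post's actual analysis code: the function name `average_tweets_per_day` and the sample timestamps are hypothetical, and it assumes every month is divided by its full number of calendar days, which would understate the average for a month in which the account was only partly active.

```python
from calendar import monthrange
from collections import Counter
from datetime import datetime


def average_tweets_per_day(timestamps):
    """Return {(year, month): tweets per calendar day} for the given timestamps."""
    # Count tweets in each (year, month) bucket.
    counts = Counter((ts.year, ts.month) for ts in timestamps)
    # Divide each count by the number of calendar days in that month.
    return {
        (year, month): n / monthrange(year, month)[1]
        for (year, month), n in sorted(counts.items())
    }


# Made-up timestamps for illustration only; not the real WeRateDogs data.
sample = [
    datetime(2015, 12, 1, 9, 0),
    datetime(2015, 12, 1, 18, 30),
    datetime(2015, 12, 2, 12, 15),
    datetime(2016, 4, 10, 8, 45),
]

averages = average_tweets_per_day(sample)
for (year, month), avg in averages.items():
    print(f"{year}-{month:02d}: {avg:.3f} tweets/day")
```

With real data, the timestamps would come from the cleaned tweet archive, and plotting the resulting values by month would give a curve of the kind the excerpt describes.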


May 18, 2018

Git and GitHub

Git: Commands. Create Repo: git init initializes a folder as a new repo, tracking all modifications under it; .git/config holds configuration for this repo only; git clone <path> [<new dir name>] cannot create a nested repo, so check your pwd first; git status ...


Mar 25, 2018

Linux Command Line Basic and Shell

Go Into the Shell Environment: VirtualBox + Vagrant + Git Bash. Terminal and Shell: The terminal (emulator) displays your keyboard input and the output, but does not itself know how to handle your input. The shell accepts the input passed on from the terminal, runs the command, and then sends the output back to the terminal for display. Default shell ...


Mar 13, 2018

Sequence Models - Deep Learning Specialization 5

deeplearning.ai by Andrew Ng on Coursera. W1: Recurrent Neural Networks. Building Sequence Model. Notation: Model Architecture: Why does a standard network not work well? Inputs and outputs can have different lengths in different samples. It doesn’t share features learned across different positions of text. CNN learns f...
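The two limitations listed above are what a recurrent network addresses: it applies the same weights at every position, so it accepts sequences of any length. The following is a minimal sketch, not the course's implementation: it assumes a scalar input and a scalar hidden state for readability, and the function name `rnn_forward` and the weight values are arbitrary illustrations.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a scalar RNN over a sequence, reusing the same weights at every step."""
    h = h0
    states = []
    for x in inputs:
        # The same w_x, w_h, and b are applied at every position in the sequence.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# Arbitrary weights and inputs, chosen only for illustration.
short = rnn_forward([1.0, 0.5], w_x=0.8, w_h=0.5, b=0.0)
long = rnn_forward([1.0, 0.5, -0.3, 0.2, 0.9], w_x=0.8, w_h=0.5, b=0.0)
print(len(short), len(long))
```

The same three parameters handle both the two-step and the five-step sequence, whereas a standard fully connected network would need a fixed input size and separate weights for each position.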


Feb 19, 2018

Machine Learning - Andrew Ng @ Coursera

Week 1: Introduction. Applications of ML: Database mining (large datasets from the growth of automation/web). Applications that can’t be programmed by hand (handwriting recognition, NLP, computer vision). Self-customizing programs (recommendations). Understanding human learning (brain, real AI). Definition of ML...


Jan 29, 2018

Python Packages for Data Science

This post records Python packages for data science that I come across in daily practice or reading, covering the whole machine learning process from visualization and pre-processing to model training and deployment. It is updated continuously. Visualization: Scikit-plot: The quickest and easiest way to plot machine learning results, bui...


Info as List

Jan 08, 2018

Create a Personal Blog in 15 Minutes @ GitHub Pages

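As the first post on this new blog, let me describe how I created it. Contrary to the title, it took me N times 15 minutes to get the blog started. Following the principle of "solve the core problem and avoid extra cognitive load," I eventually settled on this simple, stable setup. This also matches the "shallow, then deep, then refined" learning process of approaching something new. Basic idea and prerequisites: Use the free blog-generation system of the GitHub Pages project, set up the necessary site file structure in a GitHub repository, and write posts in Markdown with StackEdit. To build and use the whole blog, we need the following: a GitHub account; pick a style you like from Jekyll Themes and copy it to your own GitHub account; set the site's basic information; create a post; write the post with StackEdit. This mentions Mark...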


Tutorial

Jan 05, 2018