AIPM

Details are in the Word doc

IMPLEMENTATION: https://fullstackdeeplearning.com/march2019

Theory: http://d2l.ai/chapter_optimization/adagrad.html

Overview: https://www.youtube.com/watch?time_continue=62&v=Rb2GChTYR5Y&feature=emb_logo

Step 1: Project Statement

https://www.youtube.com/watch?time_continue=78&v=5eaza1YTGv4&feature=emb_logo

Breaking down the Problem

https://www.youtube.com/watch?time_continue=265&v=0ew7NYl7z4Y&feature=emb_logo

METRICS!

https://www.youtube.com/watch?time_continue=30&v=D3AMs0HtYWs&feature=emb_logo

Need AI? (Value of AI/ML)

https://www.youtube.com/watch?time_continue=81&v=OHG7GBIye3c&feature=emb_logo

https://www.youtube.com/watch?time_continue=185&v=MUHaV7gzRsE&feature=emb_logo

SUMMARY

https://www.youtube.com/watch?time_continue=155&v=fLPz48A7TSc&feature=emb_logo

TEAM

https://www.youtube.com/watch?time_continue=27&v=6xqxzhPkysw&feature=emb_logo

Step 2: DATA

https://www.youtube.com/watch?time_continue=120&v=z7A44YnJqCw&feature=emb_logo

Data size: enough data?

Precision & Recall

Precision and recall are just different metrics for measuring the "success" or performance of a trained model.

Precision is defined as the number of true positives (truly fraudulent transactions, in this case) over all predicted positives (true positives plus false positives), and will be higher when the number of false positives is low.
Recall is defined as the number of true positives over true positives plus false negatives, and will be higher when the number of false negatives is low.

Both have true positives in the numerator, so both will also be higher when the model correctly identifies more of the positive cases.
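
As a minimal sketch of the two definitions above, assuming binary labels where 1 marks a fraudulent transaction. The function name and the label lists are made up for illustration and do not come from the course material:

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: actually positive and predicted positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: actually negative but predicted positive.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actually positive but predicted negative.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when there are no positives at all.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 1 = fraudulent, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here one legitimate transaction is flagged as fraud (a false positive, which lowers precision) and one fraudulent transaction is missed (a false negative, which lowers recall).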

I find it helpful to look at the below image to wrap my head around these measurements:

Data Annotation

Datasets used for training machine learning models often come in a tabular format: a file (often a .csv spreadsheet) that contains information about different data points. An example is shown below, showing both the distribution (how many data points fall into which column ranges) and the different features, such as petal length and width, of different species of Iris.
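
To make "tabular format", "features", and "distribution" concrete, here is a small sketch using Python's standard csv module. The column names, the three sample rows in the style of the Iris dataset, and the bucket width are assumptions chosen for illustration:

```python
import csv
import io
from collections import Counter

# A tiny CSV in the style of the Iris dataset (header names are assumed).
CSV_TEXT = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

# Each row becomes a dict keyed by column name.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Features are every column except the label column.
feature_names = [name for name in rows[0] if name != "species"]


def bucket(value, width=2.0):
    """Map a number to the range it falls into, e.g. 4.7 -> '4.0-6.0'."""
    low = int(value // width) * width
    return f"{low:.1f}-{low + width:.1f}"


# Distribution: how many data points fall into each petal-length range.
petal_length_distribution = Counter(
    bucket(float(row["petal_length"])) for row in rows
)
print(feature_names)
print(dict(petal_length_distribution))
```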

Adding Annotations via a Platform

To annotate a new data source that perhaps only includes images of flowers and no other identifying labels or features, you'll have to use a data annotation platform. These platforms send unlabeled data to human annotators, who classify the data or provide features for it and send it back to you in a tabular format. Some cloud service providers, like AWS, provide data annotation services, as do specialized companies. Data annotation tooling is what the company Figure Eight does, so we will use their platform as an example, but the skills you learn here about designing labels and creating a dataset will be applicable across different platforms.

Figure Eight's Platform

The best way to learn about Figure Eight's data annotation tools is to explore the platform homepage. Here, you will see examples of use cases for labeling text, speech, image data, and more!

The goal of data annotation is to bring you from unstructured, unlabeled data to a desired, labeled output. Figure Eight will send your data to human annotators who can help transform it into labeled data.

From unstructured to structured data

DATA ANNOTATION

You should design a data annotation job such that a non-expert can identify the more noticeable cases of pneumonia. Since you are designing for a non-expert annotator, you should design for failure: this means including some way to capture uncertainty in your data labels and test questions.
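
One possible way to capture uncertainty, sketched under assumptions: each image is judged by several annotators, the label set includes an explicit "unsure" option, and items with low agreement are routed to expert review. The label names, the "needs_review" outcome, and the 0.7 agreement threshold are all hypothetical. This does not use or represent Figure Eight's API.

```python
from collections import Counter


def aggregate_labels(judgments, min_agreement=0.7):
    """Combine several annotators' labels for one item.

    Returns (final_label, agreement), where agreement is the share of
    annotators who chose the most common label.
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    counts = Counter(judgments)
    label, votes = counts.most_common(1)[0]
    agreement = votes / len(judgments)
    # Send the item to an expert if annotators were mostly unsure
    # or did not agree strongly enough.
    if label == "unsure" or agreement < min_agreement:
        return "needs_review", agreement
    return label, agreement


# Hypothetical judgments from annotators for three chest X-ray images.
clear_case = aggregate_labels(["pneumonia", "pneumonia", "pneumonia"])
split_case = aggregate_labels(["pneumonia", "normal", "unsure"])
unsure_case = aggregate_labels(["unsure", "unsure", "normal"])
print(clear_case, split_case, unsure_case)
```

Keeping "unsure" as an explicit option means a non-expert is not forced to guess, and the agreement score gives a simple signal for which items need a second look.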

Project Proposal

<Your Name Here>

Data Labeling Approach

Test Questions & Quality Assurance

Say you’ve run a test launch and gotten back results from your annotators, and the instructions and test questions are rated below 3.5. What areas of your instruction document would you try to improve (Examples, Test Questions, etc.)?