|
META TOPICPARENT |
name="MLRoad-map" |
Overview |
|
t7_f4_c3 |
08 May 2018 |
t8_f5_c3 |
29 May 2018 |
t12_f5_c3 |
07 Aug 2018 |
|
|
> > |
|
|
To Do List |
|
< < | Issues
- Why are candids from IPAC DB not found in alerts db?
- Why is cross validation performance decreasing steadily since t8
|
| Pipeline Development |
|
< < |
- Separate Galactic vs. Extra-Galactic
- Get light curve observations from reals into training [ POSTPONED - no way to get all confirmed spectroscopic reals ]
- Get an override label module (to override the majority vote)
|
> > |
- Get light curve observations from reals into training vetted
|
|
- Add cross matches from catalogs as a source (Nadia's from TNS)
- Flag boguses with more than 2 light curve observations.
- Catch variable stars with both isdiffpos=True and isdiffpos=False in the light curve
|
|
< < |
- Kill very old objects (from before Feb 5, 2018)
|
|
- Implement proper grid search over RF parameter space
|
|
< < |
- Get the testing framework into a Jupiter notebook
- Get features from Kowalski vs. IPAC? Do same query against Kowalski and see if we recoup candidateIds
- Pipeline Analysis:
- Labeled Data Test Set * RB score improvements
|
> > |
- Pipeline Analysis to automate:
|
| * Score improvement on known false positives and false negatives (Ragnhild's List)
* KL divergence between training set, test set features (to find major divergence)
* Plot feature distributions on reals vs. boguses (Tiara's code) |
|
< < | * what do low RB reals look like, what do high RB bogus look like? |
> > |
- What do low RB reals look like, what do high RB bogus look like?
|
| * Unlabeled Data
* Score bias per features
* Report feature importance by correlated feature groups |
|
- Active Learning to Improve Training Data Selection [ Sara ]
* Use active learning to discover potential batches of boguses (and reals, alternatively)
- Other Data Collection Sources (preferably automated)
|
|
< < |
-
- Find variables (bogus objects that have multiple alert packets, objects >= n_obs, objects with both positive and negative subtractions?)
|
> > |
- Find variables (bogus objects that have multiple alert packets, objects >= n_obs, objects with both positive and negative subtractions?)
|
|
- Automate cross matches from relevant catalogs (e.g., TNS)
- Improving Quality of GROWTH Marshall Feed
- ZTF objects provided in the GROWTH marshall are saved, but not necessarily spectroscopically confirmed. Is my list spectroscopically confirmed? Email to Ashot and Mani
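
The KL-divergence check listed under Pipeline Analysis could be sketched roughly as below. This is an illustration only, not the pipeline's code: the function names (`binned_probs`, `feature_kl`, `rank_features`), the shared equal-width binning, the pseudo-count smoothing, and the direction KL(train || test) are all assumptions made for the example.

```python
import math


def binned_probs(values, lo, width, n_bins, pseudo=1.0):
    """Histogram `values` into n_bins equal-width bins starting at `lo`.

    A pseudo-count is added to every bin (an assumption of this sketch)
    so that no bin has zero probability and the log ratio stays finite.
    """
    counts = [pseudo] * n_bins
    for v in values:
        # Clamp the maximum value into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def feature_kl(train, test, n_bins=10):
    """KL(train || test) for one feature, using bins shared by both sets."""
    lo = min(min(train), min(test))
    hi = max(max(train), max(test))
    width = (hi - lo) / n_bins if hi > lo else 1.0
    p = binned_probs(train, lo, width, n_bins)
    q = binned_probs(test, lo, width, n_bins)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_features(train_features, test_features, n_bins=10):
    """Return (feature name, KL) pairs, most divergent feature first.

    Both arguments map a feature name to a list of numeric values.
    """
    scores = {
        name: feature_kl(train_features[name], test_features[name], n_bins)
        for name in train_features
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_features` with one dict of feature columns for the training set and one for the test set puts the features with the largest divergence at the top of the list, which is where a manual look at the distributions would start.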
|
| Open Issues / Experiments
- Kowalski / IPAC candidate discrepancies
|
|
> > |
- Separate Galactic vs. Extra-Galactic classifiers
|
|
- Pixel Clump issues on certain x,y positions (see Ashish's email to Umaa on 6/28/18)
- Use alert data for Real-Bogus in real-time (which means access to 150 features and postage stamps)
- Deep Learning
- Known boguses: ZTF18aabtvch, ZTF18aaiafnn, ZTF18aaizvmy
|
|
< < |
Papers:
- ML Overview (response to Pub Board comments)
- RB Paper
|
| DONE
- 2018-08 Correlation between nbad and boguses Report Here
|
|
< < | |
> > |
- 2018-10 combine_labels.py has an override switch, to override the majority vote for examples that have been revetted
- 2018-11 Filter out old sources (from before Feb 5, 2018)
- 2018-11 Get features from Kowalski vs. IPAC? Kowalski packets are NOT a superset of the IPAC DB features! An experiment limited to the intersection of Kowalski and IPAC DB features showed decreased classifier performance! But found a workaround for the DB performance issues (Frank gave me a way to get nid, rcid from a candid)
- 2018-12 Cross validation performance decreasing steadily since t8 due to persistent contamination within the GROWTH marshall feed
- 2019-01 Get the testing framework into a Jupyter notebook
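
The override switch noted for combine_labels.py (2018-10) can be pictured with the sketch below. It is not the contents of that script: the function name `combine_votes`, the dict-based inputs, the `use_override` flag name, and the choice to return `None` on ties or missing votes are assumptions made for illustration.

```python
from collections import Counter


def combine_votes(votes, overrides=None, use_override=True):
    """Combine per-example labels by majority vote, with an optional override.

    votes:     dict mapping an example id to a list of labels from vetters
    overrides: dict mapping an example id to the label assigned on revetting
    Returns a dict mapping each example id to its combined label.
    """
    overrides = overrides or {}
    combined = {}
    for example_id, labels in votes.items():
        # A revetted label wins over the vote when the switch is on.
        if use_override and example_id in overrides:
            combined[example_id] = overrides[example_id]
            continue
        ranked = Counter(labels).most_common()
        if not ranked:
            combined[example_id] = None  # no votes at all
        elif len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            combined[example_id] = None  # tie: left unresolved in this sketch
        else:
            combined[example_id] = ranked[0][0]
    return combined
```

With the switch off, the same call falls back to the plain majority vote, which makes it easy to compare training sets built with and without the revetted labels.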
|
|
-- UmaaRebbapragada - 08 Aug 2018 |
|
META FILEATTACHMENT |
attachment="2018-08-07-ZTF-Team-Meeting-RB.pptx" attr="" comment="Umaa Rebbapragada's Presentation on RB at the ZTF Team Meeting, Stockholm" date="1533732627" name="2018-08-07-ZTF-Team-Meeting-RB.pptx" path="2018-08-07-ZTF-Team-Meeting-RB.pptx" size="641834" stream="2018-08-07-ZTF-Team-Meeting-RB.pptx" user="Main.UmaaRebbapragada" version="1" |
META FILEATTACHMENT |
attachment="Nbad_analysis.pdf" attr="" comment="Analysis of nbad feature" date="1534290349" name="Nbad_analysis.pdf" path="Nbad analysis.pdf" size="704934" stream="Nbad analysis.pdf" user="Main.CharlotteWard" version="1" |
META FILEATTACHMENT |
attachment="2018-08-30-t13_f5_c3.pptx" attr="" comment="Analysis of RB version: t13_f5_c3" date="1535748841" name="2018-08-30-t13_f5_c3.pptx" path="2018-08-30-t13_f5_c3.pptx" size="243456" stream="2018-08-30-t13_f5_c3.pptx" user="Main.UmaaRebbapragada" version="1" |
|
|
> > |
META FILEATTACHMENT |
attachment="2019-01-t15_f5_c3.pptx" attr="" comment="Analysis of RB version: t15_f5_c3" date="1550599262" name="2019-01-t15_f5_c3.pptx" path="2019-01-t15_f5_c3.pptx" size="269747" user="UmaaRebbapragada" version="1" |
|