Add CsvEpochSource and EpochEncoder enhancements #13

jpgill86 · 2018-08-10T19:55:44Z

CsvEpochSource was originally provided as a simple example of how to create a subclass of WritableEpochSource. However, it seems like it will be useful to many users of the epoch encoder, so I think it would be appropriate to migrate it into the main code.

Since Excel assumes commas are the delimiter when the file extension is CSV, the use of tabs as the delimiter in the epoch encoder example results in output files that are read incorrectly by Excel. This could be resolved by either changing the file extension to TSV or by using commas with the extension CSV. This change does the latter.

CsvEpochSource was originally implemented for the purpose of demonstrating how to create a subclass of WritableEpochSource. However, if `'name': 'animal_state'` is removed, it's fully general and could be very useful to anyone interested in saving epochs using CSVs. Consequently, it's a great candidate for adoption into the main code. This change migrates the implementation of CsvEpochSource to the main code so that users do not need to implement this subclass themselves. They are still invited to create alternative implementations of subclasses of WritableEpochSource that use different file types. Dependence of ephyviewer on pandas is avoided by requiring pandas only when CsvEpochSource is instantiated.

Accessible via inherited method get_channel_name().

jpgill86 · 2018-08-10T20:01:19Z

I also propose making WritableEpochSource into an abstract class using abc. The method save should be marked as an abstract method that must be implemented by subclasses. A new method load could also be declared as an abstract method. I'm happy to take this on if this sounds reasonable to you, @samuelgarcia.

This change allows epoch=None when creating a WritableEpochSource object. In this case, the new load() method is called to build an empty dictionary containing the appropriate keys and data types. Like the save() method, load() can be overridden in subclasses of WritableEpochSource to load epoch data from arbitrary sources. CsvEpochSource implements an example of this.

jpgill86 · 2018-08-13T17:39:14Z

I think I came up with an alternative idea that is fully backwards compatible.

Changing WritableEpochSource into an abstract class using the abc module would disallow the class to be instantiated. This would require old code that uses it to be changed, such as the existing tests in test_epochencoder.py.

Instead, I made the constructor parameter epoch optional. If None is passed, a new load() method is called to create an empty dictionary for the epoch. If the load() method is overridden in subclasses derived from WritableEpochSource, data can be loaded from arbitrary sources. This is how CsvEpochSource.load() gets called.

(1) `WritableEpochSource._clean_and_set` now discards epochs with duration shorter than 1 microsecond. (2) `CsvEpochSource.save` now rounds time and duration to the nearest microsecond before writing to file. (3) `EpochEncoder.refresh_table` now rounds start, stop, and duration to the nearest microsecond before displaying in the data table. These changes (1) prevent the creation of spurious epochs resulting from floating point arithmetic. They also improve display of floating point numbers in both (2) saved files and (3) the EpochEncoder's data table.

Algorithms in `InMemoryEpochSource.get_chunk_by_time`, `WritableEpochSource.add_epoch`, `WritableEpochSource.delete_in_between`, and `WritableEpochSource.merge_neighbors` implicitly assumed that epochs never overlap. This commit rewrites them to allow for the possibility of overlapping epochs. Added sorting to `WritableEpochSource._clean_and_set`, since these modifications were simplest when not trying to maintain temporal ordering during operations (i.e., use of `np.append` instead of `insert_item`). Removed redundant epoch deletion code from `add_epoch`. Added calls to `delete_in_between` before each `add_epoch` call in EpochEncoder so the behavior remains the same. `WritableEpochSource.fill_blank` was not modified and still implicitly assumes that epochs do not overlap.

Removed assertion in `WritableEpochSource.__init__` that epochs do not overlap. Removed code in `CsvEpochSource.load` for working around inadvertent overlap caused by floating point arithmetic problems. Added `remove_old_when_inserting_new` boolean parameter to EpochEncoder. When True (default), existing epochs are deleted when new epochs are created using shortcut keys or the region selector (this was the old behavior). When False, existing epochs are not deleted, resulting in overlapping epochs.

What the EpochEncoder does about existing epochs when a new one is created (deletes them or not) is determined by the `remove_old_when_inserting_new` parameter. With this commit, the behavior can be temporarily switched by holding the Shift key when pressing a shortcut key. If `remove_old_when_inserting_new` is True, pressing a shortcut key without the modifier key will delete epochs that overlap with the new epoch. Holding Shift will prevent this deletion and allow overlapping. The logic is inverted if `remove_old_when_inserting_new` is False (i.e., Shift can be used to force deletion).

jpgill86 · 2018-08-30T22:29:59Z

Beginning with e257e14, I've started addressing #16 by developing on top of the changes already proposed in this pull request.

So far, these changes allow epochs to be created in the EpochEncoder that overlap, and functions like Merge Neighbors work regardless of whether epochs overlap. One exception is Fill Blanks. It has not been updated yet, and for the moment it's not clear what the behavior should be. See discussion in #16.

Added new "Epoch insertion mode" button group to EpochEncoder GUI, containing radio buttons for "Mutually exclusive" and "Overlapping". Renamed global option `remove_old_when_inserting_new` to `exclusive_mode`.

Plain text labels for epochs provided in the EpochEncoder's data table are replaced with drop-down menus that allow the user to change the label of an existing epoch.

After using a drop-down menu to change an existing epoch's label and then pressing a shortcut key to insert a new one, the time would jump back to the first epoch, rather than forward one step. This was caused by `on_seek_table` triggering when the table was cleared by `refresh_table`. Interacting with a drop-down menu in the data table before this apparently caused row selection to trigger for the first row in an unexpected way, even if the drop-down menu interacted with was not in the first row.

Added a new feature that makes it easier to find epochs in the EpochEncoder's data table. When an epoch's rectangle is clicked in the plot, the corresponding row in the data table is automatically selected. More specifically, the label drop-down menu is selected, which allows the up and down arrow keys to be used to change the label quickly. Time is also moved to the start of the epoch.

Epochs would fail to delete if they started within the selected region and ended exactly at its right boundary.

In addition to selecting the corresponding row in the data table and changing the time, clicking a rectangle in the EpochEncoder plot will now update the region selection to match the epoch for easier duplication or deletion.

Changed region match feature to double-click instead of single-click since changing region might not always be desired.

samuelgarcia · 2018-09-19T10:36:39Z

ephyviewer/datasource/epochs.py

+        self._next_id = 0
+        for chan in self.all:
+            chan['id'] = np.arange(self._next_id, self._next_id + len(chan['time']))
+            self._next_id += len(chan['time'])


I already ask for it but. I don't get the 'id' goal.

ids are needed so that when get_chunk* returns a subset of the epochs, the on_rect_clicked and on_rect_doubleclicked can still uniquely identify the epoch. This allows the appropriate table row to be selected, and allows look-up of the start and end of that rectangle for self.region.setRegion.

See all this post.

samuelgarcia · 2018-09-19T10:37:54Z

ephyviewer/datasource/epochs.py


        assert self.all[0]['time'].dtype.kind=='f'
        assert self.all[0]['duration'].dtype.kind=='f'
-        assert np.all((self.times[:-1]+self.durations[:-1])<=self.times[1:])


Maybe this could be kept and test or not with an optional karg like assert_epoch_not_overlap_at_load=True

Sure, we could add that.

samuelgarcia · 2018-09-19T10:40:17Z

ephyviewer/datasource/epochs.py

+    @property
+    def id_to_ind(self):
+        return dict((id,ind) for ind,id in enumerate(self.ep_ids))
+


OK for all this setter with properties but if your final goal is multi channel EpochEncoder this will be useless no.
because each channel will in self.all[0]['label'], self.all[1]['label'], self.all[N]['label'] and so setter will be functions at the end.
Am I missing somethign ?

I agree. The multi-channel idea I described here did not come to me until these properties were already in place. If we implement true multi-channel, these properties will need to change or be removed. I haven't started working on multi-channel since I wanted to stop making changes before you had a chance to review the existing changes.

samuelgarcia · 2018-09-19T10:41:57Z

ephyviewer/datasource/epochs.py

+        keep3 = (ep_times<=t_start) & (ep_times+ep_durations>t_stop) # epochs that span the range
+        keep = keep1 | keep2 | keep3
+
+        return ep_times[keep], ep_durations[keep], ep_labels[keep], ep_ids[keep]


Why not use the parent function ?

Because get_chunk* in WritableEpochSource needed ids, but ids were not needed in InMemoryEpochSource. In fact, I originally put ids in InMemoryEpochSource and just called the parent function in WritableEpochSource as you suggest, but this broke some things, as I described in 6d5987d's commit message

samuelgarcia · 2018-09-19T10:46:32Z

ephyviewer/datasource/epochs.py

-            elif ep_times[i]<t1 and (t1<ep_stops[i]<t2):
-                #~ print 'b'
+            # if epoch starts and ends inside range, delete it
+            if ep_times[i]>=t1 and ep_stops[i]<=t2:


ep_stops[i]<=t2 became ep_stops[i]<t2.

I don't remember why it was important for me.
The general python approach is than the left limit is included (>=) and the right limit is excluded (<).

pandas apporach with .loc[t1:t2] with float index this your one. (right limit included).

Did you choose that intionally ?

Yes, it was intentional and necessary to fix a bug that existed before I began this PR. Epochs would fail to delete with the region deletion tool if they started within the selected region and ended exactly at its right boundary. None of the if clauses below this would catch that case. This case was probably not noticed before rounding of times to the nearest microsecond was added simply due to small numerical inaccuracy in floating point numbers.

samuelgarcia · 2018-09-19T11:00:16Z

Something else. I don't understand exactly the behavior of "split_epoch".
You need to select in the table and then split, isit ?

Coul we have a split on cursor and the epoch would be detected magically ?

samuelgarcia · 2018-09-19T11:02:37Z

ephyviewer/datasource/epochs.py




+class CsvEpochSource(WritableEpochSource):


OK for me.

I will add my ExcelEpochSource here after this PR because it is basically the same except:
df = pd.read_csv(self.filename, index_col=None) >>> df = pd.read_excel(self.filename, index_col=None).

Sounds great!

jpgill86 · 2018-09-19T11:47:27Z

Thank you for taking the time to review this!

briefly, what is the main purpose for this ids in epoch ?

When I added the feature to click a rectangle to select that epoch in the table, I needed a way to match rectangles to table rows. Since get_chunk* returns only a subset of the epochs and not the full list, the EpochEncoder doesn't know which epoch a rectangle belongs to. It just gets a short list of epochs that it should plot in the visible time period. These could be near the beginning of the epoch list, or near the end. The EpochEncoder doesn't know! So when it comes time to add signal connections to the rectangles so that, when clicked, the appropriate table row is selected, I needed a unique and accurate identifier for that epoch.

is the "split" when epoch overlap is done carfully ?

I tried very hard to do split carefully! I hope I succeeded.

jpgill86 · 2018-09-19T12:11:38Z

Something else. I don't understand exactly the behavior of "split_epoch".
You need to select in the table and then split, isit ?
Coul we have a split on cursor and the epoch would be detected magically ?

Splitting by clicking with the mouse cursor (perhaps right-clicking, or clicking with a key modifier) is a great idea! Why didn't I think of that? (And it wouldn't use magic, it would need to use ids, haha!) But I think the existing functionality should be retained for high-precision splitting (splitting at exact moments in time).

Yes, in it's current form, you need to select in the table first. I chose to do it this way because I wanted to allow for splitting specific epochs that overlap with others that the user may not want to split. If the time marker (vline) is positioned inside two overlapping epochs and the user wants to split just one of them, automatic detection of the user's intention is not possible. Although splitting all epochs that intersect the time marker might be a desirable behavior, I did not implement it this way since I wanted more control. In it's current form, splitting multiple epochs at the same location is easy to do if the user (1) clicks on a rectangle under the cursor to select the matching table row, (2) clicks the "Split" button, and (3) repeats for each epoch under the time marker that he or she wants to split.

jpgill86 · 2018-09-19T12:13:28Z

I think I've replied to all of your comments and questions so far. Let me know if things are still confusing or if I missed any. I understand that this is a lot to review!

jpgill86 · 2018-09-19T12:18:54Z

Based on your comments, I added a few items to the list of goals in #16.

samuelgarcia · 2018-09-19T13:18:56Z

OK.
Thank you for answering every. It is now more clear.
We or not far from a merge if you want.

2 other ideas:

I am wondering something. It is just a proposal.
Maybe it could be good to have this exclusive_mode not at params level in the EpochEncoder.
Because the user can change it whenever he wants.
It maybe would be easier to have it at EpochEncoder.init or even at source level.
So this params could not be changed on the fly.
In that case we could be easier to select the good method in get_chunk (searsorted or mask approach).
In we can force or he test for overlapping.

For the split mode. cursor did not meant with the mouse cursor but like you did but without selecting the rectangle. Maybe it could be usefull to add a key EpochEncoder.params like force_selection_before_split=True (what you did) force_selection_before_split=False cut everything at this time. Or course we would have to chosse a better keyword (easy for an english speaker).

jpgill86 · 2018-09-24T21:44:09Z

Thanks @samuelgarcia for the write access privileges!

Sorry I've been AWOL for the last several days, I had to focus on other projects.

I plan to take a close look at all your recent work on ephyviewer tomorrow.

samuelgarcia · 2018-09-25T07:27:13Z

Welcome in the small council.

jpgill86 · 2018-09-26T05:00:23Z

I am wondering something. It is just a proposal.
Maybe it could be good to have this exclusive_mode not at params level in the EpochEncoder.
Because the user can change it whenever he wants.
It maybe would be easier to have it at EpochEncoder.init or even at source level.
So this params could not be changed on the fly.
In that case we could be easier to select the good method in get_chunk (searsorted or mask approach).
In we can force or he test for overlapping.

I understand the motivation to use the more efficient search algorithm, but I think there may be such a thing as unnecessary over-optimization, haha. I tried pushing the masking method to its limits using this script:

import numpy as np
import ephyviewer

app = ephyviewer.mkQApp()
win = ephyviewer.MainViewer()

n = 1000000
epoch_source = ephyviewer.InMemoryEpochSource([{
    'time':     np.arange(n),
    'duration': np.array([0.9] * n),
    'label':    np.array(['']*n),
    'name':     ''
}])

epoch_viewer = ephyviewer.EpochViewer(source = epoch_source, name = 'epochs')
win.add_view(epoch_viewer)

win.set_xsize(30)
win.show()
app.exec_()

It wasn't until there were on the order of 1,000,000 epochs loaded in memory that the masking method became noticeably slower during random seeking (dragging the time slider back and forth) compared to the old searchsorted method on my computer. Of course, on slower systems or when many other data are loaded into ephyviewer, speed reduction due to the slower masking method might become an issue, but it seems unlikely given how large that number is.

I'd like to eventually implement the "true multi-channel" described here. This would allow us to strictly enforce the no-overlaps rule within individual channels, and it would permit the use of the better searchsorted method.

Nevertheless, if you still have concerns about performance with this PR in its current form, I think your suggestions are good for addressing them, and I could take a stab at making the changes. Let me know.

For the split mode. cursor did not meant with the mouse cursor but like you did but without selecting the rectangle. Maybe it could be usefull to add a key EpochEncoder.params like force_selection_before_split=True (what you did) force_selection_before_split=False cut everything at this time. Or course we would have to chosse a better keyword (easy for an english speaker).

Adding the ability to choose how splitting works seems reasonable. How about this as an alternative approach: I could rename the "Split" button to "Split selected epoch", and I could add another button called "Split all epochs under cursor". Would that be acceptable?

samuelgarcia · 2018-09-26T07:19:37Z

OK.Let's forget the performencee for the moment.
Could you keep commented in the code for memory the "searchsorted" thing.

I still think that exclusive_mode shoudl be in _init and not in params because in params this can be changed on th fly and will make complicated the behavior.

OK for the split porposal.

This is great, we will have a world class tools. Thank you very much.

jpgill86 · 2018-09-29T20:09:17Z

Perhaps this proposal in our other thread would address your concerns about exclusive_mode and the searchsorted algorithm?

jpgill86 · 2019-04-08T20:22:52Z

Someday I will need to revisit all this and clean it up... but for now, I'm pushing a couple fixes from my fork!

samuelgarcia · 2019-04-09T14:28:28Z

Hi Jeffrey.
happy to see again here.
I also been very busy since this summer.
I don't remember why I did not merge this at that time.
I propose to have a look again at theses 2 PR and merge this soon.
So if you need improvement you will be able to add small piece directly into teh master.
Best

jpgill86 · 2019-04-09T17:07:58Z

Hi Sam, good to hear from you! Yes, I've noticed that the Neo projects have been very busy lately with overhauls to annotations and fixes to the Blackrock IO. My inbox has been stuffed, haha!

As I recall, you suggested some alternative implementations for the features proposed here, especially relating to the new optional feature of overlapping epochs and the internal representation of multiple epoch channels. I said in #16 that I liked your suggestions and had a desire to implement them, but it would take a lot of rewriting, and I thought I might just start a new pull request. As it turned out, I didn't have much time to work on this (surprise!), so I haven't begun that rewrite.

As it is now, I use the new features in this PR regularly and find them very useful. If you think that merging soon and making incremental improvements later makes sense, that's fine with me. On the other hand, perhaps if you look closely again you will remember what you didn't like!

jpgill86 · 2019-04-09T22:28:22Z

In case those 2 pull requests weren't enough, I gave you 4 more. Don't worry, they are much smaller. 😜

You gave me write-access to the repo back in September. I can merge the new pull requests myself if you think they look sane.

samuelgarcia · 2019-05-16T18:44:35Z

I guess this ultimate and very important commit is hiden message for merging ?
I don't remember totallyy the story nor I have time to read this PR again but I trust you.
So can I merge ?

jpgill86 · 2019-05-16T18:54:58Z

Haha, I wasn't trying to steal your attention, I'm just polishing some things because I'm beginning to train some students to use the software.

I think you could merge now, and some of the things that we discussed last fall could be implemented later. I will have to read through these threads again to recall the details, but I think you had some good suggestions I wanted to try. However, I think they could be implemented as changes later, on top of these changes, rather than requiring a total rewrite before merging anything.

samuelgarcia · 2019-05-16T19:07:41Z

héhé.
thank you very much for this.
lets go.

jpgill86 added 5 commits August 10, 2018 12:39

Fixed spelling of CsvEpochSource

47b061a

Add missing color_labels param to CsvEpochSource constructor

afae788

Added channel_name param to CsvEpochSource constructor

6339817

Accessible via inherited method get_channel_name().

jpgill86 added 4 commits August 12, 2018 12:19

Raise NotImplementedError for abstract method WritableEpochSource.save()

1f0a0bf

Removed unneeded variable in CsvEpochSource

2b4adf7

Created CsvEpochSource.load() method

60315a6

jpgill86 added 9 commits August 13, 2018 13:56

Added test for empty WritableEpochSource

532743e

Updated epoch encoder example

6aadbd5

Use numbers for default EpochEncoder shortcuts

9094410

Moved EpochEncoder global options into EpochEncoder_ParamController

54cfda6

Update data table after merging neighbors in EpochEncoder

c90727c

jpgill86 changed the title ~~Add CsvEpochSource to main code~~ Add CsvEpochSource and EpochEncoder enhancements Aug 30, 2018

jpgill86 mentioned this pull request Aug 30, 2018

EpochEncoder enhancements #16

Open

19 tasks

jpgill86 added 7 commits September 1, 2018 16:19

Add EpochEncoder buttons for controlling epoch insertion mode

a921f47

Added new "Epoch insertion mode" button group to EpochEncoder GUI, containing radio buttons for "Mutually exclusive" and "Overlapping". Renamed global option `remove_old_when_inserting_new` to `exclusive_mode`.

Add editable labels to EpochEncoder data table

3ff4cb8

Plain text labels for epochs provided in the EpochEncoder's data table are replaced with drop-down menus that allow the user to change the label of an existing epoch.

Fix for failure to delete epochs in EpochEncoder

294d52c

Epochs would fail to delete if they started within the selected region and ended exactly at its right boundary.

Click rectangle to match region in EpochEncoder

04493c6

In addition to selecting the corresponding row in the data table and changing the time, clicking a rectangle in the EpochEncoder plot will now update the region selection to match the epoch for easier duplication or deletion.

Double-click rectangle to match region in EpochEncoder

14ba5dc

Changed region match feature to double-click instead of single-click since changing region might not always be desired.

samuelgarcia reviewed Sep 19, 2018

View reviewed changes

jpgill86 added 2 commits October 30, 2018 19:55

Sort before export in CsvEpochSource.save

03fbd25

Load CsvEpochSource with correct dtypes

74b7fe9

Provide a more informative save prompt for EpochEncoder

169c09d

samuelgarcia merged commit a3a07b8 into NeuralEnsemble:master May 16, 2019

jpgill86 deleted the epoch-encoder branch May 16, 2019 20:01

jpgill86 added this to the 1.1.0 milestone Jul 14, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add CsvEpochSource and EpochEncoder enhancements #13

Add CsvEpochSource and EpochEncoder enhancements #13

jpgill86 commented Aug 10, 2018

jpgill86 commented Aug 10, 2018

jpgill86 commented Aug 13, 2018 •

edited

Loading

jpgill86 commented Aug 30, 2018 •

edited

Loading

samuelgarcia Sep 19, 2018

jpgill86 Sep 19, 2018

samuelgarcia Sep 19, 2018

jpgill86 Sep 19, 2018

samuelgarcia Sep 19, 2018

jpgill86 Sep 19, 2018

samuelgarcia Sep 19, 2018

jpgill86 Sep 19, 2018

samuelgarcia Sep 19, 2018

jpgill86 Sep 19, 2018

samuelgarcia commented Sep 19, 2018

samuelgarcia Sep 19, 2018

jpgill86 Sep 19, 2018

jpgill86 commented Sep 19, 2018

jpgill86 commented Sep 19, 2018

jpgill86 commented Sep 19, 2018

jpgill86 commented Sep 19, 2018

samuelgarcia commented Sep 19, 2018

jpgill86 commented Sep 24, 2018

samuelgarcia commented Sep 25, 2018 •

edited

Loading

jpgill86 commented Sep 26, 2018

samuelgarcia commented Sep 26, 2018

jpgill86 commented Sep 29, 2018

jpgill86 commented Apr 8, 2019

samuelgarcia commented Apr 9, 2019

jpgill86 commented Apr 9, 2019

jpgill86 commented Apr 9, 2019

samuelgarcia commented May 16, 2019

jpgill86 commented May 16, 2019

samuelgarcia commented May 16, 2019

Add CsvEpochSource and EpochEncoder enhancements #13

Add CsvEpochSource and EpochEncoder enhancements #13

Conversation

jpgill86 commented Aug 10, 2018

jpgill86 commented Aug 10, 2018

jpgill86 commented Aug 13, 2018 • edited Loading

jpgill86 commented Aug 30, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

samuelgarcia commented Sep 19, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jpgill86 commented Sep 19, 2018

jpgill86 commented Sep 19, 2018

jpgill86 commented Sep 19, 2018

jpgill86 commented Sep 19, 2018

samuelgarcia commented Sep 19, 2018

jpgill86 commented Sep 24, 2018

samuelgarcia commented Sep 25, 2018 • edited Loading

jpgill86 commented Sep 26, 2018

samuelgarcia commented Sep 26, 2018

jpgill86 commented Sep 29, 2018

jpgill86 commented Apr 8, 2019

samuelgarcia commented Apr 9, 2019

jpgill86 commented Apr 9, 2019

jpgill86 commented Apr 9, 2019

samuelgarcia commented May 16, 2019

jpgill86 commented May 16, 2019

samuelgarcia commented May 16, 2019

jpgill86 commented Aug 13, 2018 •

edited

Loading

jpgill86 commented Aug 30, 2018 •

edited

Loading

samuelgarcia commented Sep 25, 2018 •

edited

Loading