--- Log opened Mon Sep 24 00:00:17 2012
-!- blackburn [~blackburn@] has quit [Quit: Leaving.]00:03
shogun-buildbot_build #111 of nightly_default is complete: Failure [failed test]  Build details are at
-!- hoijui [] has joined #shogun08:02
-!- sonne|work [~sonnenbu@] has joined #shogun08:41
-!- ptizoom [4f475132@gateway/web/freenode/ip.] has joined #shogun09:41
-!- heiko [] has joined #shogun11:44
-!- heiko1 [] has joined #shogun13:10
-!- heiko [] has quit [Ping timeout: 244 seconds]13:10
-!- heiko1 [] has quit [Remote host closed the connection]13:54
-!- blackburn [5bdfb203@gateway/web/freenode/ip.] has joined #shogun14:15
blackburnsonne|work: lua appears to recursively call typemap on my machine14:16
blackburndid that ever happen to you?14:16
-!- heiko [] has joined #shogun14:50
blackburnheiko: hey15:10
heikoblackburn hi15:10
blackburnhow are you?15:10
heikoblackburn, thanks I am fine how are you?15:10
blackburnfine but I am about to fall asleep right now :D15:11
blackburnanything new with shogun stuff you do?15:11
blackburnover the last few days I have changed SGString a little15:11
heikoI saw that, nice15:11
blackburnnow lua fails though15:11
heikoI postponed the labels stuff15:12
heikosince we need to discuss that properly15:12
heikowith others15:12
blackburnafter vodka15:12
heikoswitched to making statistical tests work with streaming features15:12
blackburnto make that more productive15:12
heikolol :) yeah15:12
heikoI find it a serious problem that we cannot do structures of SG* data, so I don't want to touch that yet15:12
heikoall this effort went into serialization, it would be stupid to just drop that now15:13
blackburnheiko: are you having more time now?15:13
heikoblackburn, kind of yes15:13
blackburnmaybe we could get back to kaggle?15:13
blackburnor rather, to the kaggle idea15:14
heikooh yeah, tell me about it15:14
heikoI dont know if I have *that* much time, but its interesting anyways15:14
blackburnwell - what can I tell - we join a team and attack some problem15:14
blackburnand probably earn money and all hail to shogun :D15:14
heikoprobably? :)15:15
heikoI participated earlier this year15:15
heikoand got to place 150 or so15:15
heikoso no money :)15:16
blackburnnah don't be so pessimistic, there is always a chance15:16
heikothere is15:16
blackburnrecently one guy that used scikits reported that he won something15:16
heikojust my experience15:17
heikoI wont do it for the money anyways15:17
heikothere are much better ways getting to money ;)15:17
heikobut its fun!15:17
heikoand one learns something15:17
blackburnyeah it is always wrong to do anything for money :)15:17
blackburnlets talk about that once again this week15:18
blackburnI will check for any competitions we could take part in15:18
heikoah you haven't got something particular yet?15:18
blackburnheh no, did you think I have?15:19
blackburnmy mistake :)15:19
blackburnno, nothing yet15:19
heikook so just an idea15:20
heikowell ok15:20
heikosay blackburn, do you know the streaming features a bit?15:20
blackburnyeah a bit15:20
heikoI wonder if there is the possibility to compute kernel values with them15:20
heikousing the CStreamingFeatures interface only15:20
heikowithout knowing the type of features15:21
heikoI just want to compute kernel values of features that I stream15:22
heikoand then forget15:22
blackburnit seems that no, we can't15:22
blackburnwell first of all it should be15:22
heikothats the whole point of having streaming features15:22
blackburndotfeatures not just features15:22
heikowhat do you mean?15:23
blackburndo you need some custom kernel for that?15:23
blackburnor dot-based?15:23
heikono a shogun kernel is fine15:23
heikono any type please15:23
heikoCKernel interface15:23
blackburnah so any possible shogun kernel15:23
blackburnthat's a problem15:23
heikoGaussian kernel as a working example15:24
blackburnwhy do you need that at all?15:24
heikowhen feature data is too large to fit in memory15:24
heikoI want to stream it15:24
heikoand I only need kernel values of certain pairs15:24
heikoand I want to stream these15:25
heikolarge - scale stuff15:25
blackburncurrently we pull vectors one by one15:25
heikoI only need sums of kernel values15:25
blackburnfrom features15:25
blackburnis that ok for you?15:25
heikoI would prefer some blocks but one by one is fine to start with15:25
heikoI have two streaming features15:26
heikoand need two samples from each15:26
heikoand then kernel of all combinations, that are 4 values15:26
blackburnare you sure you need streaming but not just file features?15:26
heikois there a difference?15:26
blackburnyeah streaming is not really for random access15:27
heikoI just need "one more" functionality15:27
blackburnit is rather something for online learning when examples come all the time15:27
heikoyeah thats what I need15:27
heikodo you have another idea?15:28
blackburnlet me make it more clear - do you have unknown stream of features?15:28
blackburnor just big data15:28
heikoI need this: "give me four more examples from features"15:28
blackburnthere is a difference15:28
blackburnwell I am unsure streaming is what you need still15:28
heikowhats the difference?15:28
blackburnstreaming knows only one feature vector: the current one15:29
blackburnand it just gives you more current vectors one by one15:29
heikobut thats what I need, I just need 4 at once15:29
heikoor two from each feature instance15:29
blackburnbut why do you need streaming then?15:30
heikoI need 4 at once for many many times15:30
blackburnah wait15:31
blackburnso you have two current vectors15:31
blackburnand compute kernel values?15:31
heikothen forget these15:31
heikoand repeat15:31
blackburnah okay then it is the streaming thing15:31
blackburnbut I don't think you can handle that in general way15:31
blackburnat least currently15:32
heikoonline kernel algorithms face exactly the same problem15:32
blackburnwe have only linear algorithms15:33
blackburnor even algorithm15:33
heikoah we dont have kernel on-line stuff?15:33
heikothought so15:34
heikowhat about adding something like this15:34
heikolet me think15:35
heikomaybe add a method to CKernel15:35
heikowhich only works if underlying features are streams15:35
blackburnkernel matrix?15:36
heikoor even overload old one15:36
heikoyeah in case one wants to sample multiple features at once15:36
blackburnI do not understand that, because we are assumed to have a kind of infinite stream of examples15:36
blackburnin case of streaming features15:36
heikowhich is a single scalar in case of one sample15:36
heikoyeah but you want kernel matrices on small subsets in streaming15:36
heikoextreme case is one example from each stream: a 1x1 kernel matrix15:36
heikoto make it easier15:37
blackburnI don't mind to add some15:37
heikooverload CKernel::kernel15:37
blackburnno idea how should that work though15:37
heikobut then we would have to create a new subclass for each existing kernel or?15:37
blackburnyeah that's the worst ever way15:37
heikoor we pass an existing kernel to StreamingKernel15:38
heikothat is then used for evaluation15:38
heikoCStreamingKernel(CKernel* single_kernel, CStreamingFeatures* p, CStreamingFeatures* q)15:38
heikoand if you call CStreamingKernel::kernel(index_t m) you get an mxm matrix for streamed values15:39
blackburnoops I've got an email about my application for a new job15:39
blackburnI sent my CV to some company15:40
heikocongrats then!15:41
blackburnlet me phone them to ask when can I be interviewed15:42
heikookay, good luck15:43
blackburnoops I have interview tomorrow15:48
blackburnheiko: so you want to have a method that computes mxm for first streaming vectors, right?15:53
heikoI think I have a solution:15:53
heikoCStreamingKernel as described above15:53
heikowhich you can ask for an mxn matrix on streamed samples15:54
heikoand this is computed as follows:15:54
heikoI add a get_features_bla method to CStreamingFeatures, which returns a non-streaming feature object with samples from the stream15:54
heikoand these "normal" features (e.g. dense) are then put into kernel and matrix is computed15:55
heikodo you see any problems with that?15:55
heikoblackburn, I dont see any, should work15:58
blackburnyes good idea I think15:59
blackburntwo features at the time16:00
heikoor even more (may be set by a method)16:00
heikoin case one needs blocks16:00
blackburnblocks like?16:09
heikosay 100 features at once16:12
blackburnfeatures ? or feature vectors?16:14
heikoI am currently using both terms for the same thing16:23
-!- sonne|work [~sonnenbu@] has left #shogun []16:54
heikoblackburn, I made a draft and sent a PR. Let me know what you think17:03
blackburnheiko: looks valid17:05
heikoblackburn, from my class I can then just ask for 2x2 kernel matrices on the streaming data and work with that17:05
heikowill leave it open for some time for sonney2k to have a look at it, maybe he has suggestions.17:05
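The design heiko settles on above (wrap an existing kernel, pull a block of ordinary in-memory vectors off each stream, compute a small kernel matrix, then forget the block) can be sketched roughly as follows. This is not the code from the PR and it does not use Shogun's classes: `VectorStream`, `pull_block`, `StreamingKernelSketch` and `next_matrix` are hypothetical names made up for illustration, and the Gaussian kernel is written as exp(-||x-y||^2 / width), a parameterisation assumed here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Matrix = std::vector<std::vector<double>>;
using KernelFn = std::function<double(const Vec&, const Vec&)>;

// Hypothetical stand-in for a streaming feature source: it only knows how
// to hand out the next vector and keeps no history.
class VectorStream {
public:
    explicit VectorStream(std::function<Vec()> next) : next_(std::move(next)) {}

    // Pull n vectors off the stream into an ordinary in-memory block
    // (the role played by the "get_features_bla" idea in the chat).
    std::vector<Vec> pull_block(std::size_t n) {
        std::vector<Vec> block;
        block.reserve(n);
        for (std::size_t i = 0; i < n; ++i) block.push_back(next_());
        return block;
    }

private:
    std::function<Vec()> next_;
};

// Gaussian kernel, assumed here to be exp(-||x - y||^2 / width).
inline KernelFn gaussian_kernel(double width) {
    return [width](const Vec& x, const Vec& y) {
        assert(x.size() == y.size());
        double sq_dist = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double d = x[i] - y[i];
            sq_dist += d * d;
        }
        return std::exp(-sq_dist / width);
    };
}

// Wraps any kernel function plus two streams; no per-kernel subclass needed.
class StreamingKernelSketch {
public:
    StreamingKernelSketch(KernelFn kernel, VectorStream& p, VectorStream& q)
        : kernel_(std::move(kernel)), p_(p), q_(q) {}

    // Returns an m x n kernel matrix on freshly streamed samples.
    // The pulled blocks are local, so they are dropped on return.
    Matrix next_matrix(std::size_t m, std::size_t n) {
        const std::vector<Vec> xs = p_.pull_block(m);
        const std::vector<Vec> ys = q_.pull_block(n);
        Matrix K(m, std::vector<double>(n, 0.0));
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t j = 0; j < n; ++j)
                K[i][j] = kernel_(xs[i], ys[j]);
        return K;
    }

private:
    KernelFn kernel_;
    VectorStream& p_;
    VectorStream& q_;
};
```

Calling `next_matrix(2, 2)` repeatedly gives the "two samples from each stream, four kernel values, then forget" pattern discussed earlier, and a larger block size (say 100) only changes the arguments.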
-!- blackburn [5bdfb203@gateway/web/freenode/ip.] has quit [Quit: Page closed]17:10
-!- heiko [] has quit [Quit: Leaving.]18:18
-!- blackburn [~blackburn@] has joined #shogun18:50
wikingblackburn: yo19:48
wikingseems it works19:48
blackburnwiking: your code?19:54
wikingit did converge19:54
blackburnwhat is it btw?19:54
blackburnSO latent svm?19:54
wikingbased on bmrm19:54
-!- audy [] has quit [Changing host]20:12
-!- audy [~audy@unaffiliated/audy] has joined #shogun20:12
-!- ptizoom [4f475132@gateway/web/freenode/ip.] has quit [Ping timeout: 245 seconds]20:46
-!- heiko [] has joined #shogun20:54
-!- romi__ [~mizobe@] has joined #shogun20:56
-!- romi_ [~mizobe@] has quit [Ping timeout: 246 seconds]20:56
-!- CIA-47 [] has joined #shogun21:13
-!- heiko [] has quit [Ping timeout: 274 seconds]21:13
-!- hoijui [] has quit [Ping timeout: 274 seconds]21:13
-!- CIA-31 [] has quit [Ping timeout: 274 seconds]21:13
-!- hoijui [] has joined #shogun21:14
-!- heiko [] has joined #shogun21:14
-!- heiko [] has quit [Ping timeout: 241 seconds]22:14
-!- heiko [] has joined #shogun22:16
-!- naywhayare [] has joined #shogun22:55
-!- voket [26688042@gateway/web/freenode/ip.] has joined #shogun23:05
voketHey all. I am attempting to serialize a CRelaxedTree. The built-in print_serializable was hitting an infinite loop, so I am writing my own visiting function, but I am puzzled by the printed kernels -- in my two-node tree, the second node has a three-dimensional kernel when I expected a two-dimensional one like the root. Any ideas?23:07
voketI am using a LinearKernel and calling its save method with a CAsciiFile23:07
blackburnvoket: hey.. let me check23:08
blackburnvoket: you meant save_serializable, right?23:10
voketnot on the kernel. Should I be using save_serializable instead of save()?23:10
voketblackburn: save_serializable looked like it had a thornier interface, but I can switch over. I'll try that and get back to you.23:11
blackburnvoket: well yes, the method you should rather use is save_serializable23:12
blackburnvoket: save(FILE* f) is more or less deprecated, actually23:13
voketblackburn: Ok. I will switch over and see if I can get it to behave. Thanks for the help.23:14
blackburnhowever we need to ensure that RelaxedTree registers its parameters properly23:14
blackburnvoket: what do you need to store?23:14
voketblackburn: It looks like it is using SG_ADD in its constructor, but when I print I just get a mess of growing ascii.23:14
blackburnare your features big?23:15
blackburnwe have one problem I didn't manage to find time to solve23:15
voketblackburn: I want the trained learner from each level, so: For each node,  Left classes, right classes, support vectors, alphas, bias23:16
-!- shogun-buildbot [] has joined #shogun23:16
blackburnthe problem is that all submachines (tree nodes) should have a copy of features23:16
blackburnit can become huge..23:16
voketSubmachines get a local copy of the features? Why can't they just have a view?23:17
-!- shogun-buildbot_ [] has quit [Ping timeout: 240 seconds]23:17
blackburnvoket: no technical problem - just a thing to be improved23:18
voketAt the moment I am testing with a small toy dataset, but it's good to know that I may run into scaling issues.23:18
blackburnit is on our todo list23:19
blackburnso I hope we will fix that somehow - I don't mind to patch multiclass machines somehow for now23:19
voketOhhhh. The reason the print_serializable is infinitely looping is that children have their parent as one of their serialized parameters.23:21
voketI'll fix it and pull request23:21
blackburnthanks, we'd appreciate any improvements23:21
voketI am happy to help - one of my pull requests was merged in this weekend (aasted on github)23:22
blackburnahh so now I recognize you better :)23:22
blackburnI am lisitsyn at github23:23
voketNice to meet you. I am looking at the serialization code -- is there a reason that print is not just a save using stdout?23:27
voketAlso, is there a quick way to break this recursion loop? I could just unregister the parent parameter, but it would be better if the parameter printer just detected cycles.23:32
blackburnlet me check what print does23:32
blackburnvoket: yes we need to handle cycles there23:32
blackburnbut it is a nightmare :)23:32
-!- shogun-t1olbox [] has quit [Ping timeout: 240 seconds]23:32
--- Log closed Mon Sep 24 23:32:59 2012
--- Log opened Mon Sep 24 23:33:06 2012
-!- shogun-toolbox [] has joined #shogun23:33
-!- Irssi: #shogun: Total of 13 nicks [1 ops, 0 halfops, 0 voices, 12 normal]23:33
-!- Irssi: Join to #shogun was synced in 6 secs23:33
voketYes, huge nightmare. So long as print/save are not generating objects it could just be a hash map by memory location stored in the CSerializableFile. But that's kind of ugly.23:35
voketIs there any way I can mark a parameter to be excluded from serialization? Also it looks like the person who wrote the tree class knew about this since they wrote their own debug_print function.23:36
blackburnwell with brute force it could be excluded for sure23:38
blackburnvoket: the author of this code is chiyuan zhang, our gsoc 2012 student23:40
blackburnyou may attempt to talk to him using the mailing list actually - not sure if he has time now - he started a PhD at MIT23:41
voketOk. For now I will write a workaround in my local code and then see if I can write a better workaround in general to push upstream.23:43
blackburnyeah that's what I wanted to suggest for now23:44
voketI'll only bug Chiyuan Zhang if I need to -- the first month of a PhD is a bad time to get a hold of someone.23:44
blackburnheh yeah23:45
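voket's idea of tracking already-visited objects by memory location, so that a child's parent pointer does not send the printer back around the loop, can be sketched roughly as below. This is a standalone illustration and not Shogun's serialization code: `TreeNode`, `add_child` and `CycleSafePrinter` are hypothetical names, and the `<ref ...>` back-reference format is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical tree node: children own their subtrees, while the raw
// parent pointer is the back-edge that creates the cycle.
struct TreeNode {
    std::string name;
    TreeNode* parent = nullptr;
    std::vector<std::unique_ptr<TreeNode>> children;
};

// Attach a new named child and wire up its parent pointer.
inline TreeNode& add_child(TreeNode& parent, const std::string& name) {
    assert(!name.empty());  // names double as back-reference labels
    parent.children.push_back(std::unique_ptr<TreeNode>(new TreeNode()));
    TreeNode& child = *parent.children.back();
    child.name = name;
    child.parent = &parent;
    return child;
}

class CycleSafePrinter {
public:
    std::string print(const TreeNode& root) {
        visited_.clear();
        std::ostringstream out;
        visit(&root, out);
        return out.str();
    }

private:
    void visit(const TreeNode* node, std::ostringstream& out) {
        if (node == nullptr) {
            out << "null";
            return;
        }
        // insert() reports whether this address was already seen; if so,
        // emit a short back-reference instead of recursing again.
        if (!visited_.insert(node).second) {
            out << "<ref " << node->name << ">";
            return;
        }
        out << node->name << "{parent=";
        visit(node->parent, out);
        out << ",children=[";
        for (std::size_t i = 0; i < node->children.size(); ++i) {
            if (i > 0) out << ",";
            visit(node->children[i].get(), out);
        }
        out << "]}";
    }

    std::unordered_set<const TreeNode*> visited_;  // keyed by memory location
};
```

For a root with a single child, `print` visits the root, descends into the child, and on reaching the child's parent pointer writes `<ref root>` rather than printing the root a second time. As noted in the chat, this only works while printing or saving does not create new objects, since identity is the object's address.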
blackburnare you applying these complicated tree methods?23:45
voketI had planned on it. Is that a mistake?23:46
blackburnno, no, I just wanted to know if it works well in practice23:47
blackburnI never tried23:47
--- Log closed Mon Sep 24 23:53:30 2012
--- Log opened Mon Sep 24 23:53:40 2012
-!- shogun-t1olbox [] has joined #shogun23:53
-!- Irssi: #shogun: Total of 14 nicks [1 ops, 0 halfops, 0 voices, 13 normal]23:53
! [freenode-info] why register and identify? your IRC nick is how people know you.
-!- Irssi: Join to #shogun was synced in 8 secs23:53
blackburnvoket: why do you stick to trees btw?23:53
-!- shogun-toolbox [] has quit [Ping timeout: 252 seconds]23:54
blackburnokay, I'm leaving now23:59
blackburnsee you23:59
voketok. thanks for your help23:59
--- Log closed Tue Sep 25 00:00:17 2012