--- Log opened Mon May 14 00:00:40 2012
02:17 -!- av3ngr [av3ngr@nat/redhat/x-crzsxpnlvzddpsxs] has joined #shogun
03:30 -!- abn_ [av3ngr@nat/redhat/x-fpjupbrdnhavmnjp] has joined #shogun
03:31 -!- av3ngr [av3ngr@nat/redhat/x-crzsxpnlvzddpsxs] has quit [Ping timeout: 252 seconds]
03:33 -!- abn_ [av3ngr@nat/redhat/x-fpjupbrdnhavmnjp] has quit [Client Quit]
05:46 -!- vikram360 [~vikram360@] has quit [Ping timeout: 244 seconds]
05:46 -!- vikram360 [~vikram360@] has joined #shogun
06:27 -!- n4nd0 [] has joined #shogun
06:38 -!- vikram360 [~vikram360@] has quit [Quit: Leaving]
09:00 -!- uricamic [~uricamic@2001:718:2:1634:8cea:f88e:2be4:5ab] has joined #shogun
09:02 -!- emrecelikten [~emrecelik@] has joined #shogun
09:06 <emrecelikten> Hi all
09:15 <n4nd0> hey emrecelikten
09:16 <emrecelikten> n4nd0: How are you today?
09:16 <n4nd0> emrecelikten: I am fine, what about you?
09:16 <emrecelikten> n4nd0: Fine, had a GSoC all-nighter. "Converging" into zombie mode with each passing hour :D
<CIA-113> shogun: Soeren Sonnenburg master * r73b9eea / src/interfaces/lua_modular/swig_typemaps.i : remove left over %enddef -
09:38 -!- eric_ [2e1fd566@gateway/web/freenode/ip.] has joined #shogun
09:38 <eric_> hi there
09:45 <n4nd0> sonne|work: good morning! around??
09:46 <n4nd0> Nico and I discussed the other day that it would be good to think about how to do KernelSOMachine, even if it is *very very* slow
09:47 <n4nd0> in order to introduce it in shogun I came up with two possibilities for the class hierarchy
09:47 <n4nd0> either: CMachine <---- CSOMachine, and CSOMachine has two children, CLinearSOMachine and CKernelSOMachine
09:47 <n4nd0> or: CLinearMachine <---- CLinearSOMachine and CKernelMachine <---- CKernelSOMachine
09:48 <n4nd0> I like the first possibility better; sonne|work, what do you think?
09:48 <sonne|work> n4nd0: yes, the first one
09:48 <sonne|work> it matches what we do with multiclass machines
09:49 <n4nd0> sonne|work: good, thank you!
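The hierarchy agreed on above (the first option) can be sketched as follows. This is a hypothetical Python stand-in using the class names from the discussion, not actual shogun code; the real classes are C++.

```python
# Sketch of the chosen hierarchy: CMachine <- CSOMachine, with
# CLinearSOMachine and CKernelSOMachine as its two children.
# Plain Python stand-ins for illustration only.

class CMachine:
    """Base class of all learning machines."""

class CSOMachine(CMachine):
    """Common base for structured-output machines."""

class CLinearSOMachine(CSOMachine):
    """Structured-output machine with an explicit linear weight vector."""

class CKernelSOMachine(CSOMachine):
    """Structured-output machine working implicitly through a kernel."""

# Both variants share the CSOMachine interface, mirroring how the
# multiclass machines hang off a common base class.
assert issubclass(CLinearSOMachine, CSOMachine)
assert issubclass(CKernelSOMachine, CMachine)
```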
10:11 -!- blackburn [~qdrgsm@] has joined #shogun
11:17 -!- blackburn [~qdrgsm@] has quit [Quit: Leaving.]
11:29 -!- emrecelikten is now known as emre-away
11:52 -!- cronor [] has joined #shogun
11:53 <cronor> is it intended that in python you have to import CSVMLightOneClass? everything else is without the C
11:55 <n4nd0> cronor: there is probably a line missing in interfaces/python_modular/Classifier.i or a similar file
11:55 <cronor> n4nd0: i'll check it
12:23 -!- emre-away is now known as emrecelikten
<CIA-113> shogun: Soeren Sonnenburg master * reed11fa / src/interfaces/modular/Classifier.i : remove C prefix from CSVMLightOneClass in modular interfaces -
13:52 <sonne|work> cronor: thx
13:54 <n4nd0> cronor: you fixed it, cool :)
13:55 <cronor> n4nd0: no, soeren fixed it. i wasn't sure if i should commit a one-line fix
13:56 <n4nd0> oh, anyhow, it got fixed
13:57 <cronor> although i would like to contribute to the project. are there any open issues which can be done without too much cpp knowledge? i looked through the github issues and they all seem kind of big
13:58 <n4nd0> cronor: if you would like to avoid cpp you could for example work on some examples in python or any other language we support, of your preference
14:00 <cronor> it's not about avoiding cpp, i just have very little experience and don't feel able to do a restructuring issue
14:00 <n4nd0> all right
14:00 <n4nd0> I know that sonne|work was interested in expanding LDA to support multiclass
14:01 <n4nd0> there is already code implemented in python to do that in scikits
14:01 <eric_> hi all
14:02 <n4nd0> I ported their QDA into shogun
14:02 <n4nd0> so I guess the plan here would be to substitute our current LDA with something similar to what they have there, working for multiclass directly
14:02 <n4nd0> sonne|work: right?
14:02 <eric_> which string kernel would you advise me to use for string features of different sizes, with a quite large alphabet (RAWBYTE)? Thanks in advance for any hints!
14:03 <n4nd0> eric_: hey! I am not an expert on string kernels, but isn't it the normal thing for string features to be of different lengths??
14:04 <eric_> n4nd0: no, in shogun most of the string kernels are implemented to compare strings of the same length
14:05 <n4nd0> eric_: oh, my bad then :(
14:05 <eric_> n4nd0: only a few of them (maybe I am wrong..) are compatible with features of different size, and since I have a quite big alphabet I don't know which kernel could do the work??
14:06 <n4nd0> eric_: I am sorry I cannot help you, I don't know that much about the different string kernels
14:06 <cronor> n4nd0: i'll look into it tonight and see
14:06 <n4nd0> cronor: all right, feel free to ask me around here if you need some help
14:07 <eric_> n4nd0: do you know who implemented the available string kernels in shogun?
14:07 <n4nd0> eric_: I am not sure ... I'd say sonne|work is the most likely option
14:08 <n4nd0> eric_: anyway, have you tried testing several of them and analysing which one gives you better results?
14:08 <eric_> n4nd0: all right, thx, I hope he will read the logs
14:08 <n4nd0> eric_: I guess you could do an example using your test data, plug in one StringKernel or another, and see which one performs best
14:09 <sonne|work> n4nd0: re MC-LDA yes, or even just add a multiclass variant
<n4nd0> sonne|work: I have a bit of trouble here
14:10 <n4nd0> sonne|work: you see for example in CResultTest or in CStructuredApplication::get_joint_feature_representation
14:10 <n4nd0> the return type
14:10 <n4nd0> the one that appears there as vector
14:10 <n4nd0> I am not sure what to use in shogun since it must be something like an SGVector
14:11 <n4nd0> but it should work whether the features are Dense, Sparse or String
14:11 <n4nd0> sonne|work: do you know what I mean?
<sonne|work> n4nd0: can't you use sth like
14:15 <sonne|work> I mean, never ever explicitly use the feature representation but just define some operations that are needed?
14:16 <eric_> sonne|work: Hi, hoping you have time to respond: what family of string kernels should I focus on if I use string features of different size with a quite big alphabet?
14:17 <n4nd0> sonne|work: but I don't think that the joint feature representation should return something like CDotFeatures, it should return just a feature vector
14:18 <n4nd0> sonne|work: so as I see it, if the joint space is represented with DenseFeatures it should return an SGVector, if with SparseFeatures an SGSparseVector, and so on
14:18 <sonne|work> eric_: some n-gram thing I would say... probably hashed
14:18 <sonne|work> n4nd0: what I meant is - why is it necessary at all?
14:18 <sonne|work> otherwise I think it is just an SGVector<float64_t>
14:18 <sonne|work> but a huuuuuge one
14:19 <n4nd0> sonne|work: mm, what do you mean by "why is it necessary"?
14:21 <sonne|work> n4nd0: for example in SVMs you never need access to the examples x
14:21 <sonne|work> or Phi(x)
14:21 <sonne|work> all you need is the operations defined in dotfeatures
14:22 <sonne|work> like w <- w + alpha*Phi(x)
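The operations sonne|work describes can be sketched like this: a minimal, hypothetical Python analogue of the DotFeatures idea (not the actual shogun API), using implicit degree-2 polynomial features as the feature map Phi.

```python
# Hypothetical sketch of the DotFeatures idea: the learner only ever needs
#   dot(idx, w)                  ->  <Phi(x_idx), w>
#   add_to_dense(alpha, idx, w)  ->  w <- w + alpha * Phi(x_idx)
# so Phi(x) itself is never materialized as a stored vector.

class QuadraticDotFeatures:
    """Implicit degree-2 polynomial features: Phi(x)[i*d+j] = x[i]*x[j]."""

    def __init__(self, data):
        self.data = data          # list of input vectors
        d = len(data[0])
        self.dim = d * d          # dimension of the implicit feature space

    def dot(self, idx, w):
        """Compute <Phi(x_idx), w> on the fly."""
        x = self.data[idx]
        d = len(x)
        return sum(x[i] * x[j] * w[i * d + j]
                   for i in range(d) for j in range(d))

    def add_to_dense(self, alpha, idx, w):
        """w <- w + alpha * Phi(x_idx), again without building Phi."""
        x = self.data[idx]
        d = len(x)
        for i in range(d):
            for j in range(d):
                w[i * d + j] += alpha * x[i] * x[j]

feats = QuadraticDotFeatures([[1.0, 2.0]])
w = [0.0] * feats.dim
feats.add_to_dense(1.0, 0, w)   # w <- 0 + Phi(x) recovers Phi(x) explicitly
val = feats.dot(0, w)           # <Phi(x), Phi(x)> = (x.x)^2 = 25.0
```

This is the trick that lets linear SVMs train in huge feature spaces: the weight updates and scores only ever call `dot` and `add_to_dense`.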
14:22 <eric_> sonne|work: does shogun permit such n-gram hashing?
14:23 <sonne|work> eric_: indeed, it might not for the n-gram kernel...
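The hashed n-gram idea itself is simple; here is a generic sketch (hypothetical helper functions, not a shogun API) showing how variable-length strings over any alphabet become comparable with an ordinary dot product.

```python
# Generic "hashed n-grams" sketch (hypothetical, not shogun code): each
# string, regardless of its length or alphabet size, is mapped to a
# fixed-dimensional vector by hashing its n-grams into buckets; any
# dot-product-based (linear) kernel then applies directly.

def hashed_ngram_vector(s, n=3, num_buckets=1 << 12):
    """Map a string of any length to a fixed-size bag-of-n-grams vector."""
    vec = [0.0] * num_buckets
    for i in range(len(s) - n + 1):
        bucket = hash(s[i:i + n]) % num_buckets
        vec[bucket] += 1.0
    return vec

def linear_kernel(a, b):
    """Plain dot product between two feature vectors."""
    return sum(x * y for x, y in zip(a, b))

# Strings of different lengths now live in the same feature space:
k = linear_kernel(hashed_ngram_vector("ACGTACGT"), hashed_ngram_vector("ACG"))
```

The bucket count trades memory against hash collisions; with a large alphabet like RAWBYTE the hashing is what keeps the feature space tractable.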
14:29 <eric_> sonne|work: thx. another dummy question: I have an alphabet of size=100; how, basically, can I do the "mapping" into CStringFeatures<char> to use it with the string kernels implemented in shogun?
16:29 -!- nicococo [] has joined #shogun
16:39 -!- gsomix [~gsomix@] has joined #shogun
16:39 <gsomix> #????? !
16:39 <gsomix> water, water everywhere
16:41 -!- nicococo [] has left #shogun []
16:44 -!- blackburn [~qdrgsm@] has joined #shogun
16:48 -!- emrecelikten [~emrecelik@] has quit [Quit: Leaving.]
16:50 <gsomix> blackburn, ?????
17:08 <n4nd0> hey gsomix!
17:08 <n4nd0> you have been working lately with SGVector and SGSparseVector, right?
17:09 <blackburn> gsomix: have you finished with your array conversion?
17:18 -!- uricamic [~uricamic@2001:718:2:1634:8cea:f88e:2be4:5ab] has quit [Quit: Leaving.]
<CIA-113> shogun: Soeren Sonnenburg master * rea2e2f2 / src/interfaces/lua_modular/swig_typemaps.i : fix valgrind error in lua typemap -
17:32 -!- karlnapf [] has joined #shogun
17:38 <gsomix> n4nd0, a little. sonney2k is working with SGVector/SGMatrix/SGSparseVector now.
17:38 <gsomix> blackburn, nope.
17:38 -!- blackburn [~qdrgsm@] has quit [Quit: Leaving.]
17:40 <n4nd0> gsomix: nico and I are wondering if we could have a type that could behave either as an SGVector or as an SGSparseVector
17:40 <n4nd0> you know, to put a method like
17:41 <n4nd0> vector f() {}
17:41 <n4nd0> and some implementations of f return SGVector and others SGSparseVector
17:45 -!- blackburn [~blackburn@] has joined #shogun
17:52 -!- blackburn [~blackburn@] has quit [Quit: Leaving.]
17:54 -!- blackburn [~blackburn@] has joined #shogun
17:54 <gsomix> n4nd0, it sounds cool. but I think you should talk with sonney2k about it. sorry
17:55 <blackburn> n4nd0: why do you need it?
17:57 <n4nd0> gsomix: ok, I'll ask him
17:58 <n4nd0> look at line 101
17:58 <n4nd0> for the joint feature vectors
17:58 <n4nd0> i.e. the feature vectors that one builds from information of the training data and the labels
17:58 <blackburn> n4nd0: can it be sparse?
17:58 <n4nd0> something like psi(xi, yi)
17:59 <n4nd0> blackburn: nico said that normally a dense representation is used
17:59 <n4nd0> but that he would like to use sparse for high-dimensional spaces
18:00 <n4nd0> he seemed interested in the sparse vector provided here
18:02 -!- eric_ [2e1fd566@gateway/web/freenode/ip.] has quit [Quit: Page closed]
18:02 -!- eric_ [2e1fd566@gateway/web/freenode/ip.] has joined #shogun
18:04 -!- blackburn [~blackburn@] has quit [Quit: Leaving.]
18:13 -!- n4nd0 [] has quit [Quit: leaving]
18:23 -!- eric_ [2e1fd566@gateway/web/freenode/ip.] has quit [Quit: Page closed]
18:44 -!- blackburn [~blackburn@] has joined #shogun
18:51 -!- cronor [] has quit [Ping timeout: 248 seconds]
19:22 <@sonney2k> karlnapf, around and have a bit of time?
19:22 <@sonney2k> n4nd0 - this is the wrong(tm) approach
19:22 <@sonney2k> karlnapf, if so
19:23 <@sonney2k> karlnapf, please run make check-valgrind in the libshogun dir
19:24 <karlnapf> sonney2k, hi there, unfortunately I only have a few minutes :(
19:24 <karlnapf> yes, I know that the examples fail
19:24 <karlnapf> I already had a look but couldn't find the error after some time
19:24 <karlnapf> I don't really get it; for the SGVector transition, just de-activating the reference counts worked
19:25 <karlnapf> but for the matrices, it doesn't work
19:25 <@sonney2k> karlnapf, I think it is due to the way we serialize sgvector stuff
19:25 <@sonney2k> we need a separate way...
19:25 <@sonney2k> basically we have to store the refcount for these too
<CIA-113> shogun: Heiko Strathmann master * r33e2e37 / (7 files): -some interface changes after talked to Arthur -
<CIA-113> shogun: Heiko Strathmann master * r2639838 / src/shogun/evaluation/CrossValidation.cpp : code cleanups -
<CIA-113> shogun: Heiko Strathmann master * rd2863d9 / (8 files in 2 dirs): Merge pull request #526 from karlnapf/master -
19:26 <@sonney2k> otherwise we have a leak at some point / double free
19:26 <karlnapf> I already thought a bit about this
19:27 <@sonney2k> I don't even mind if you do an incompatible change here
19:27 <karlnapf> I am a bit afraid that this will cause trouble
19:27 <karlnapf> when you save, there is a certain refcount
19:27 <karlnapf> then when you load from another situation
19:27 <@sonney2k> shogun 2.0 is very different from 1.0 - so we cannot really do anything about it
19:27 <karlnapf> it's not correct anymore
19:28 <karlnapf> well, that solves at least the migration problem :)
19:28 <@sonney2k> karlnapf, ok, this will happen if we have external objects, not serialized, pointing to things
19:29 <karlnapf> sonney2k, ok, let's save the refcount into the vector then
19:29 <karlnapf> however, this still doesn't solve the mem-leak problem
19:29 <@sonney2k> but still much better than leaks in the general case
19:29 <@sonney2k> why not?
19:29 <karlnapf> at least I think
19:30 <karlnapf> because if you de-activate the ref-counting
19:30 <karlnapf> it still has to work
19:30 <@sonney2k> yes, but w/ leaks
19:30 <karlnapf> why? there were no leaks before the transition
19:31 <@sonney2k> now we never use SG_FREE(vec.vector) or destroy_vector() etc
19:31 <karlnapf> but if I now use the system with the ref-counting deactivated
19:31 <karlnapf> Oh, I already checked that
19:31 <karlnapf> I manually added SG_FREEs to the examples
19:31 <karlnapf> still leaking
19:31 <karlnapf> there is something more subtle going on
19:32 <@sonney2k> it is not the correct way anyway
19:32 <karlnapf> yes, true
19:32 <karlnapf> mmh, I really fear touching all these examples again :)
19:32 <@sonney2k> karlnapf, so please ping me when you write out the refcount along with the vector
19:32 <karlnapf> but ok
19:32 <karlnapf> yes, will notify you.
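The plan settled on here, writing the refcount out together with the vector payload, can be sketched roughly like so. This is a toy Python illustration of the serialization layout only; it assumes nothing about shogun's actual on-disk format.

```python
# Toy sketch (hypothetical) of "save the refcount into the vector":
# pack the refcount next to the payload so that loading restores a
# consistently counted object instead of one with a refcount of zero.

import struct

def save_vector(vec, refcount):
    """Pack refcount + length + float64 payload into bytes."""
    header = struct.pack("ii", refcount, len(vec))
    return header + struct.pack(f"{len(vec)}d", *vec)

def load_vector(blob):
    """Inverse of save_vector: returns (vector, refcount)."""
    refcount, n = struct.unpack_from("ii", blob)
    offset = struct.calcsize("ii")
    vec = list(struct.unpack_from(f"{n}d", blob, offset))
    return vec, refcount

blob = save_vector([1.0, 2.0, 3.0], refcount=2)
vec, rc = load_vector(blob)
```

As karlnapf points out, a refcount stored at save time is only meaningful if every holder of those references is serialized alongside; unserialized external holders leave the restored count stale.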
19:33 <karlnapf> sorry for not being so available currently, but there's another exam tomorrow
19:33 <karlnapf> and two next week
19:33 <@sonney2k> no worries
19:33 <karlnapf> but then I've almost got them all; the last two are more relaxed :)
19:34 <blackburn> 9th exam - 678 to go, but heiko is relaxed now :D
19:35 <karlnapf> well, what's the alternative? freaking out all day is quite exhausting :D
19:35 <blackburn> I'd rather joke about the # of exams :)
19:36 <karlnapf> k :)
19:36 <karlnapf> too many
19:36 <karlnapf> my brain feels so saturated
19:36 <karlnapf> ok, guys, gotta go, take care sonney2k, blackburn, bye
19:36 <@sonney2k> blackburn, does the lua_modular stuff die on your machine too?
19:36 <@sonney2k> karlnapf, cu! and thanks
19:37 <blackburn> sonney2k: I have never ever tried
19:37 -!- karlnapf [] has quit [Quit: Leaving.]
19:39 <@sonney2k> blackburn, then please try features_string_char_modular.lua
19:40 <blackburn> sonney2k: ok, I need to switch systems
19:40 -!- blackburn [~blackburn@] has quit [Quit: Leaving.]
19:43 <@sonney2k> look at this
19:43 <@sonney2k> #0  0x00007ffff5f84e7c in shogun::CStringFeatures<char>::set_feature_vector (this=0x0, vector=..., num=0) at features/StringFeatures.cpp:230
19:43 <@sonney2k> #1  0x00007ffff67c2f8b in _wrap_StringCharFeatures_set_feature_vector (L=0x62b010) at modshogun_wrap.cxx:172721
19:43 <@sonney2k> this === 0x0
19:44 <@sonney2k> gsomix, done with dynamicobjectarray?
19:44 -!- blackburn [~qdrgsm@] has joined #shogun
19:45 <gsomix> sonney2k, nope. I'm a little busy with optics now.
19:51 <@sonney2k> blackburn, <sonney2k> #0  0x00007ffff5f84e7c in shogun::CStringFeatures<char>::set_feature_vector (this=0x0, vector=..., num=0) at features/StringFeatures.cpp:230
19:51 <@sonney2k> <sonney2k> #1  0x00007ffff67c2f8b in _wrap_StringCharFeatures_set_feature_vector (L=0x62b010) at
19:51 <@sonney2k> seems like the object is dead
19:52 <blackburn> sonney2k: cool :D
19:52 <blackburn> sonney2k: modshogun_wrap.cxx:742:17: fatal error: lua.h: No such file or directory
19:52 <blackburn> sonney2k: which pkg?
19:56 -!- n4nd0 [] has joined #shogun
19:56 <n4nd0> sonney2k: hi! tell me, what is the good approach then :)?
19:57 <@sonney2k> n4nd0, as I said
19:57 <@sonney2k> n4nd0, don't use the feature representation
19:57 <@sonney2k> figure out which operations you need on Phi(x,y) and define only these
19:58 <@sonney2k> e.g. <Phi(x,y),w>
19:58 <@sonney2k> w <- w + alpha Phi(x,y)
19:58 <@sonney2k> so you will never need to use Phi(x,y) *explicitly*
19:59 <n4nd0> sonney2k: ok, I understand that
19:59 <n4nd0> sonney2k: but what I don't see is how considering it that way is going to save me the need of defining a return type for the function
20:00 <@sonney2k> n4nd0, that trick is very powerful... we use it in linear svms to train in million-dim feature spaces w/ millions of examples w/o ever computing Phi(x) explicitly
20:00 <@sonney2k> you don't need that function at all
20:00 <n4nd0> aham, so no function
20:02 <@sonney2k> n4nd0, you can provide a default one that returns an SGVector by computing w <- 0 + Phi(x,y) following the example above
20:05 <n4nd0> sonney2k: blackburn, can you give me an example of how to apply this trick in a simple case?
20:05 <n4nd0> let's say we have to do <Phi(x,y), w>
20:06 <n4nd0> how do we compute it without computing Phi(x,y) explicitly?
20:06 <blackburn> n4nd0: if you need <Phi(x,y), w> then just provide a function that does <Phi(x,y), w>
20:06 <blackburn> hmm, it is not a given that it is always possible
20:06 <blackburn> not explicitly
20:06 <n4nd0> eeeh, can you elaborate there a bit? :)
20:07 <blackburn> n4nd0: okay, for example we have poly features
20:07 <blackburn> n4nd0: in dot features we have only two required operations
20:08 <blackburn> dot and add
20:08 <blackburn> so if we want poly features we don't need to construct them explicitly
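The same dot/add trick applies to the joint feature map Phi(x,y) discussed above. A hypothetical minimal Python sketch (not shogun code), using multiclass as the simplest structured-output case: Phi(x,y) embeds x into the y-th block of a long weight vector, so dot and add only ever touch one block and Phi(x,y) is never built.

```python
# Sketch of the dot/add trick for a joint feature map Phi(x,y)
# (hypothetical). Multiclass seen as structured output: Phi(x,y) is x
# placed in block y of a long vector of length num_classes * dim(x).

def joint_dot(x, y, w):
    """<Phi(x,y), w>: only block y of w is ever read."""
    d = len(x)
    return sum(x[i] * w[y * d + i] for i in range(d))

def joint_add(alpha, x, y, w):
    """w <- w + alpha * Phi(x,y): only block y of w is ever written."""
    d = len(x)
    for i in range(d):
        w[y * d + i] += alpha * x[i]

num_classes, d = 3, 2
w = [0.0] * (num_classes * d)
joint_add(1.0, [1.0, 2.0], 1, w)     # "w <- 0 + Phi(x,y)" recovers Phi(x,y)
score = joint_dot([1.0, 2.0], 1, w)  # <Phi(x,y), Phi(x,y)> = <x,x> = 5.0
```

This also shows sonney2k's fallback: applying `joint_add` to a zero vector yields an explicit SGVector-like Phi(x,y) when one is really needed.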
20:09 <@sonney2k> n4nd0, please have a look at the link I gave you with the operations in there
20:09 <n4nd0> sonney2k: blackburn, I will look at it after dinner and get back with any doubts
20:09 <n4nd0> thank you!
20:11 <blackburn> sonney2k: hmm, what should be done to make lua work?
20:13 <blackburn> lua: features_dense_real_modular.lua:1: module 'modshogun' not found:
20:13 <blackburn> no field package.preload['modshogun']
20:13 <blackburn> no file './modshogun.lua'
20:14 <@sonney2k> p SWIG_Lua_ConvertPtr(L,1,(void**)&arg1,swig_types[878],0)
20:14 <@sonney2k> seems like the thing is NULL - so no wonder it dies
20:14 <@sonney2k> blackburn, look at the beginning of check.sh
20:28 -!- blackburn [~qdrgsm@] has quit [Quit: Leaving.]
20:36 -!- blackburn [~blackburn@] has joined #shogun
20:57 -!- blackburn [~blackburn@] has quit [Quit: Leaving.]
20:57 -!- blackburn [~blackburn@] has joined #shogun
<CIA-113> shogun: Soeren Sonnenburg master * ra4985db / examples/undocumented/lua_modular/features_string_char_modular.lua : disable set_feature_vector from lua for now -
<CIA-113> shogun: Soeren Sonnenburg master * rcb50eab / src/interfaces/lua_modular/swig_typemaps.i : simplify lua typemaps (use sgvector & co) -
20:59 -!- puffin444 [62e3926e@gateway/web/freenode/ip.] has joined #shogun
21:00 -!- ckwidmer [] has joined #shogun
21:45 -!- blackburn [~blackburn@] has quit [Quit: Leaving.]
21:47 -!- blackburn [~qdrgsm@] has joined #shogun
21:54 -!- puffin444 [62e3926e@gateway/web/freenode/ip.] has quit [Ping timeout: 245 seconds]
21:58 -!- gsomix_ [~gsomix@] has joined #shogun
22:00 -!- gsomix [~gsomix@] has quit [Ping timeout: 244 seconds]
22:21 -!- gsomix_ [~gsomix@] has quit [Read error: Operation timed out]
22:34 <n4nd0> blackburn: hey
22:35 <blackburn> n4nd0: hi
22:39 <n4nd0> blackburn: so, about the features issue
22:39 <n4nd0> I have got the idea that what I should do then is to make a new class that inherits from CDotFeatures
22:40 <n4nd0> I think that the operations we need are basically those that are defined there
22:40 <blackburn> yeah, that is the common case
22:40 <n4nd0> and I think that we need a new class since the features here are computed from a feature vector of the input space and some kind of structured data
22:40 <n4nd0> blackburn: do you think that is a good idea?
22:41 <blackburn> n4nd0: yes, probably
22:41 <n4nd0> then I don't really think that the sparse or non-sparse issue matters
22:42 <n4nd0> I mean that there is no need to treat it separately
22:43 -!- Marty28 [] has joined #shogun
22:44 <n4nd0> hey Marty28, how is it going?
22:45 <Marty28> Applying shogun to several datasets
22:46 <Marty28> Has your google summer started?
22:46 <n4nd0> nice results?
22:46 <Marty28> Yes, for easy cases
22:46 <n4nd0> the official date has not come yet, but I think we are all hands-on already :D
22:47 <Marty28> I am currently playing hide and seek with shogun
22:47 <n4nd0> why so?
22:47 <Marty28> Creating artificial data and letting shogun identify the features
22:48 <n4nd0> nice, what are you using to identify the features?
22:48 <Marty28> If the field I am in is new, I cannot rely on existing experience
22:50 <Marty28> I have to make assumptions on what feature combinations my labels depend on
22:51 <Marty28> E.g. localized motifs combined with other numbers
22:52 <Marty28> So first I have to make shogun depend on them
22:54 <Marty28> Else later shogun will not show me the importance of the real features
22:55 <Marty28> I do not go for sensitivity but for the importance and usage of features as a result
22:55 <Marty28> My boss does not like that
22:56 <Marty28> Bioinformaticians want benchmarks
22:56 -!- in3xes [~in3xes@] has joined #shogun
22:56 <Marty28> Biologists want explanations
22:57 <n4nd0> as we all do ;)
22:57 <n4nd0> well ... ok, maybe not all
22:57 <Marty28> So ideally I take features that have been shown to be important by biologists
22:58 <Marty28> In SOME of my positive labels
22:59 <Marty28> Then I optimize shogun for using this type of information
22:59 <Marty28> If shogun finds my features in the biologists' examples
23:00 <Marty28> It will also find/use features that are like these cases
23:00 <Marty28> Candidates for research for biologists
<shogun-buildbot> build #543 of lua_modular is complete: Success [build successful]  Build details are at
23:08 <Marty28> So I guess I have to be careful that the feature selection does not go for the trivial features, masking the subtle ones
23:10 <Marty28> So I could use real data and hide noisy features in the positives
23:11 <Marty28> Then I remove the big features and see if the small ones pop up with different methods
23:11 <n4nd0> maybe some kind of weighting for the features could also help here?
23:12 <n4nd0> giving more importance to the subtle ones could make sure they are not forgotten
23:12 <Marty28> I know
23:12 <n4nd0> all right ...
23:12 <Marty28> Still, I guess artificial data will help
23:13 <Marty28> Also it gives me presentable results
23:13 <Marty28> My real data is rather difficult
23:14 <Marty28> Also I will check how methods react on mixed phenomena
23:15 <Marty28> I.e. when feature combinations a and b and c lead to a +1 label
23:16 <Marty28> E.g. 1,2,4 and 5,6,8 but not 1,6,8
23:19 <Marty28> I will see, just a master's thesis
23:24 <n4nd0> good night, people
23:25 -!- n4nd0 [] has quit [Quit: leaving]
23:42 -!- cronor [] has joined #shogun
23:43 -!- Marty28 [] has quit [Quit: Colloquy for iPad -]
--- Log closed Tue May 15 00:00:40 2012