--- Log opened Thu Dec 22 00:00:19 2011
-!- blackburn [~blackburn@] has quit [Quit: Leaving.]01:46
-!- Ram108 [~amma@] has joined #shogun06:57
-!- Ram108 [~amma@] has quit [Ping timeout: 244 seconds]09:49
-!- Ram108 [~amma@] has joined #shogun10:09
-!- axitkhurana [~akshit@] has joined #shogun11:28
-!- axitkhurana [~akshit@] has quit [Quit: Leaving.]11:40
-!- Ram108 [~amma@] has quit [Ping timeout: 255 seconds]12:50
-!- puneetgoyal [~puneetgoy@] has joined #shogun13:47
puneetgoyalhey, what can I check in the attachment , if it can contribute to probability of the mail being a spam or a ham13:53
-!- heiko [] has joined #shogun15:16
heikosonney2k, around?15:16
sonne-workheiko: ^15:21
heikosonney2k, finally we meet :)15:21
sonne-workyes, took a while... seems like summer aka spare time is over :)15:22
heikoindeed :)15:22
heikohow is it going?15:22
sonne-workso how is it going? do you have some time now over the years?15:22
sonne-workI am basically busy all the time nowadays15:22
sonne-workbut I try to do small changes whenever possible15:23
heikowell same here, the UCL master is like crazy workload wise15:23
heikobut I have holidays until mid january now15:23
heikoI visit family and friends15:23
heikobut have some time in between15:23
heikoso it should be possible to do something in these weeks now15:24
sonne-workwould be nice15:24
sonne-workI don't know if you received my email from today but maybe you explain a bit more what you have in mind15:24
heikoyes i did15:24
heikowell basically the most annoying problem is to allocate an "empty" parameter15:25
sonne-worknot sure what that means but go on15:26
heikocurrently, if you want to load a parameter from a file15:26
heikoyou call load on the instance of the object you want to load15:26
heikobut all the memory is already allocated15:26
heikoall variables are registered etc15:26
heikobut now imagine you dont have a class instance and want to load something from memory15:26
heikoimagine a class has an SGVector as a variable15:27
heikothese normally are member variables15:27
heikoso you have to allocate a SGVector15:27
heikoand then also the actual memory in that sgvector15:27
heikoknow what I mean?15:28
heikoIt is quite hard to load data from file when you dont got the class instance to load into, in summary15:28
heikothe methods new_cont/delete_cont15:28
heikodo not work in this case15:28
sonne-workwhy don't you add load/save methods to SGVector?15:29
sonne-workthen you could just say vec.load etc15:29
sonne-workOne more thing I don't understand - when does it happen that you don't have a class instance and want to load sth?15:31
heikoThis happens when you want to migrate parameters15:32
heikoyou got a class with a parameter, but the one in the file is different15:32
heikoso you cannot load directly15:32
heikobut to migrate, you have to load somehow15:32
heikoCurrently, when you got a sgvector15:33
heikoyou have to15:33
sonne-workso you can load only basic types right?15:33
sonne-worklike scalars, or double*, int / SGVector<double>15:33
sonne-workeach of them need extra treatment though15:34
heikodata=SG_MALLOC(SGMatrix<char>, 1);15:34
sonne-workno code reuse15:34
sonne-workwhat is that data= line?15:34
heikoto be able to load data from a file you need a TParameter whose parameters point to already allocated data15:35
heikoin case of sgmatrix, that sgmatrix is a class variable15:35
heikothats why you have to do this line before loading15:35
heikoif you dont, new_data/delete_data of TParameter fails15:35
heikobut this gets completely messy if more types come in15:36
sonne-workwhat other types?15:36
heikoeverytime this happens, a case distinction is needed, for all types.15:36
sonne-workbut I don't see a way around this15:37
sonne-workdo you?15:37
heikoit would be nice to have a method that creates a TParameter with all corresponding data allocated15:37
sonne-workI understand that SGObject based classes can be created via new_sgserializable15:38
sonne-workmaybe you could do the same thing for SGVector / SGMatrix / SGSparseMatrix / SGStringList15:39
sonne-work    CSGObject* new_sgserializable(const char* sgserializable_name,15:40
sonne-work                                        EPrimitiveType generic);15:40
sonne-workscalars still need extra treatment though15:41
sonne-workany other ideas?15:43
heikoI thought of what would be if this SGVector stuff would be done differently:15:44
heikoI find it very confusing that when registering the variables15:44
heikothat the address of SGVector/SGMatrix are registered15:45
heikowhy not only register the array and the length15:45
heikothen the wrapper stuff would still be ok.15:45
heikobut the problems with allocating SGVector structs would not be there15:46
heikothen the new_cont / delete_cont methods would only be operating on arrays15:46
heikowhich would simplify things enormously15:46
heikothen allocating empty structures is also more easy15:46
heikoPerhaps I am wrong with this15:46
sonne-workso what you are saying is that you would in the add(SGVector) method of parameter register the ptrs?15:47
heikoyes, just register the array and the length of the SGVector15:47
heikothen it would be completely out of  sight for the TParameter class15:47
sonne-workyes that would work15:48
heikoalso then these things would not be appearing anymore:15:48
sonne-workwhich is highly illegal anyways15:48
heiko(in new_cont)15:48
heikoyes this is dirty to the max15:48
heikoit is there because of the problem to get the array of a sgvector you dont know the type of15:49
sonne-workso to summarize: one would have the very basic types like, double, int, ..., then vector like things double*, int / and matrix/string/sparse matrix and SGObject15:49
sonne-workand we map SGVector etc to vector etc15:50
sonne-workso we have the more basic representation underneath15:50
sonne-workcertainly one way to do it15:50
heikoWhen using SGVector underneath, a lot of stuff has to be changed in the new_cont methods15:51
sonne-workthe other would be to remove all this double*, int serialization stuff and replace it with SGVector stuff15:51
heikobecause everything in the low level parameter stuff is based to work on only arrays15:51
sonne-workwhich would totally go away15:51
sonne-workI mean you could work with references only15:51
sonne-workno longer any pointers but just the object itself15:52
heikoyes, possible15:52
heikobut this means to change it at all places at once15:52
sonne-workno reference is possible with void15:52
sonne-workso it would need extra treatment...15:53
sonne-workI certainly find the other alternative easier15:53
sonne-workso I would rather go for that one for now15:53
heikothis changing the new_cont methods all over is my horror when continuing with the current approach15:53
sonne-workI understand that15:53
sonne-workif we have had SGVector etc from the beginning things would not look so bad :)15:54
heikoyes, but this is always the case when developing software :D15:54
sonne-workanyway, it also makes sense to still have the legacy double*, int serialization15:54
heikobut this is easily possible isnt it?15:55
heikothe register methods still take arrays as parameter15:55
sonne-workwell that still works as usual - so yes15:55
heikoso the sgvector register methods are only wrappers for these15:55
sonne-workI only meant if one would totally throw that away and convert to the SGVector etc stuff15:55
sonne-workthe only problem with the double* etc approach is that you need to ensure all variables in sgvector are serialized15:56
sonne-workmaybe you add some other register_serialization_varaibles() in sgvector itself15:57
sonne-workwhich is then called when you call add(SGVector) in Parameter15:57
heikooh yes, you mean this do_free stuff etc?15:57
heikook possible, but how to deserialise?15:58
sonne-workin the same way15:59
sonne-workyou have your SGObject where a SGVector vec lives15:59
heikoalso there might be problems with the names of the sub-variables in SGVector15:59
sonne-workit is registering its parameters15:59
sonne-workand you fill in the values15:59
sonne-workproblems with the names?16:00
sonne-workI guess devil is in the details again16:00
heikonevermind the names,  just thinking ...16:01
heikooif problems16:01
heikoI want to think of the migration to avoid more problems when doing this16:01
heikostill, how to load from a file without a instance16:02
heikowhen there are only arrays, its easy16:02
heikothe boolean would be a normal variable16:03
sonne-workbut you have that problem no matter where you register the parameter right ?16:03
heikowhich problem?16:04
sonne-workloading w/o object instance16:04
sonne-workI mean ok, you can load the variables into memory16:04
sonne-workand then do all the migration16:04
sonne-workand then later on - how does it get into the final object / SGVector?16:05
sonne-workahh no I see16:05
sonne-workno problem nevermind16:05
heikoafter the migration is complete the migrated data is copied into the actual registered array of the SGVector16:06
heikoand also the length16:06
sonne-workyes exactly16:06
heikothen the migration does not even see these wrapper types for vectors and matrices16:06
sonne-workso I don't see a problem w/ migration for now16:06
sonne-workso you could give this a try16:07
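A minimal sketch of the load, migrate, copy flow discussed above, written in Python as a stand-in rather than against Shogun's C++ Parameter/TParameter classes. Every name here (`load_raw`, `migrate`, `Model`, `w_old`) is hypothetical; the only point is that migration works on plain arrays plus a length, and the owning object receives the data at the very end.

```python
# Illustrative only: plain dicts and lists stand in for registered arrays and lengths.

def load_raw(stored):
    """Load parameters into free-standing memory, with no class instance involved."""
    # Each vector is kept as (array, length); no wrapper type is needed here.
    return {name: (list(values), len(values)) for name, values in stored.items()}


def migrate(raw):
    """Hypothetical migration step: the old file called the parameter 'w_old'
    and stored integers; the current class expects 'weights' as floats."""
    migrated = dict(raw)
    if "w_old" in migrated:
        array, length = migrated.pop("w_old")
        migrated["weights"] = ([float(v) for v in array], length)
    return migrated


class Model:
    """Stand-in for an object that owns a vector-like member."""

    def __init__(self):
        self.weights = []      # the registered array
        self.weights_len = 0   # the registered length

    def fill_from(self, migrated):
        # Only now is the migrated data copied into the registered array and length.
        array, length = migrated["weights"]
        self.weights = list(array)
        self.weights_len = length


old_file = {"w_old": [1, 2, 3]}   # made-up contents of an old parameter file
model = Model()
model.fill_from(migrate(load_raw(old_file)))
```

In this arrangement the migration code never touches the wrapper type, which mirrors the argument in the conversation for registering only the array and its length.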
heikoOne problem might occur. people who have stored SGVectors in files would not be able to load anymore16:07
sonne-workbut you could map that back to the old vector way or?16:07
heikooh, theoretically .. yes16:08
sonne-workcase CT_VECTOR: case CT_SGVECTOR:16:09
sonne-workactually - would it even be any work?16:09
heikodont really know, but I changed some stuff to save/load SGVectors, they are handled a bit differently, will have a look16:09
sonne-workit looks like you treat them the same way like vectors and then only determined data start etc16:10
sonne-workand fill things in16:10
sonne-workso it seems like no problem16:10
sonne-workanyways - I would give this a go16:10
sonne-workso if you have time - it would be great if you could first cleanup Parameter*16:11
sonne-workand then continue with the migration16:11
heikoyes, this is the way16:11
sonne-workI will try to get buildbots to work more reliably16:11
sonne-workbut it is tough16:11
sonne-work(but we have one for cygwin and for linux now)16:11
sonne-workosx is coming16:12
heikobasically, migration works, the SGVector stuff was the thing that caused all these problems that I couldn't solve in the summer16:12
heikoah cool16:12
heikocygwin? :)16:12
sonne-worksounds great16:12
sonne-workwindows 716:12
heikoreally cool16:12
sonne-workand cygwin running there16:12
heikoI can still hear my old professor complaining that my bachelor stuff does not run on windows16:12
heiko(what they were using)16:12
sonne-workwhen we have these build bots running more reliably we can even create binary release-snapshots16:13
heikothat would also be great, especially for cygwin16:13
heikowell, ok will be off now for some sport, see you later, Ill inform you about the progress16:14
sonne-workthanks for continuing to work on this16:14
sonne-workwould be very cool to have this feature!16:14
sonne-worksee you16:14
heikosonne-work, I just looked at Parameter.h16:19
heikowhat we talked about would mean to change all the add* methods from using pointers to SG* to using copy-by-value right?16:20
heikoSo void add(SGVector<bool>* param, const char* name,const char* description="");16:20
heikoadd(SGVector<bool> param, const char* name,const char* description="");16:20
heikofor all SGVector/SGMatrix entries, right?16:21
heikowhat about the SGSparseVector SGString stuff? Is this also affected?16:24
heikoah no...the length needs to be the real thing, not a copy, sorry16:29
sonne-workyes everything is affected16:32
sonne-workwe have SGStringList16:32
sonne-workall by reference yes16:32
sonne-workall these classes will copy their local variables in the copy constructor16:33
sonne-workso SGVector x = SGVector(foo, foo_len)16:33
sonne-workwill have x.vector == foo and x.vlen==foo_len16:33
sonne-workheiko: btw one more thing - I am trying to write some print function to improve user experience from python_modular16:34
sonne-workthat needs the list of parameters - but you return SGVector<char*> for that16:35
sonne-workbut it would be much better to use SGStringList<char> for this16:35
sonne-workI changed the code already16:35
sonne-workbut haven't committed yet16:35
sonne-workheiko: ahh and btw you can write16:35
heikowhere is the method again?16:35
sonne-workSGVector<int>(x, x_len);16:36
sonne-workSGVector<int> my_vector(x, x_len);16:36
sonne-workto declare my_vector16:36
heikoinstead of?16:36
sonne-workSGVector<int> my_vector = SGVector<int>(x, x_len);16:36
heikoah ok, thx16:37
heikomodshogun_wrap.cxx: In function 'void shogun_CSGObject___setstate__(shogun::CSGObject*, PyObject*)':16:41
heikomodshogun_wrap.cxx:5722:13: error: expected ';' before 'fstream'16:41
heikomake[1]: *** [modshogun_wrap.cxx.o] Error 116:41
heikowhen compiling python_modular16:41
sonne-worktrying to reproduce this16:47
sonne-workdid you do a git clean -dfx before building? (will delete all files not in git)16:47
heikorebuilding ...16:55
heikoyes happens16:56
-!- Ram108 [~amma@] has joined #shogun17:05
sonne-workI just rebuilt here17:15
sonne-workdoesn't happen17:15
sonne-workdid you do make install etc?17:16
sonne-workheiko: please show me the offending line17:16
heikofstream = new CSerializableAsciiFile(fname, 'r');17:18
heikocomes after:17:18
heikoif (!pickle_ascii)17:18
heiko                sg_io->message(MSG_ERROR,"SGBase.i", 346, "File contains an HDF5 stream but " \17:18
heiko                        "Shogun was not compiled with HDF5" \17:18
heiko                        " support! -  cannot load.")17:18
sonne-workahh so a problem w/ hdf517:21
sonne-worklet me check17:21
sonne-workbut that is ok17:22
sonne-workyou just don't have hdf5 installed17:22
sonne-workheiko: how does this relate to the error above?17:23
heikoI thought because of the missing ; before fstream17:24
heikobecause: modshogun_wrap.cxx:5722:13: error: expected ';' before 'fstream'17:24
sonne-workheiko: I meant just open modshogun_wrap.cxx in an editor17:33
sonne-workand then show me the surrounding lines17:33
heikothats what I did17:33
heikofstream = new CSerializableAsciiFile(fname, 'r');17:33
heikois the line17:33
heikoand the other stuff comes before17:34
sonne-workand before that?17:34
sonne-workahh after the print17:34
CIA-1shogun: Soeren Sonnenburg master * r5a8c07d / src/interfaces/modular/SGBase.i : add missing ';' -
sonne-work" support! -  cannot load."); instead of " support! -  cannot load.")17:35
heikokk :)17:39
shogun-buildbotbuild #381 of python_static is complete: Failure [failed configure]  Build details are at  blamelist: sonne@debian.org17:39
-!- blackburn [~blackburn@] has joined #shogun17:42
heikocompiles for me now17:43
blackburnhey what's up17:43
heikohey blackburn, sorry I was just about to go :)17:46
blackburnheiko: when?17:46
heikosorry, i have an appointment17:47
heikowill talk to you later :) bye17:47
blackburnRam108: hi17:48
blackburnRam108: will you be available in next hour?17:48
Ram108blackburn: could u help me out with the rest of it now that my exams are over and i am free :)17:48
Ram108yeah i ll be :)17:49
blackburnRam108: yeah sure but not now17:49
Ram108sure take ur time :)17:49
blackburnjust came home and now off for some dinner :)17:49
Ram108definitely wen u r free :)17:49
Ram108take ur time :)17:49
blackburnok, I'll write to you soon17:49
-!- heiko [] has left #shogun []17:52
-!- ishaanmlhtr [~ishaan@] has joined #shogun18:07
blackburnRam108: re18:24
blackburnRam108:how can I help you?18:25
-!- Ram108 [~amma@] has quit [Ping timeout: 244 seconds]18:26
-!- Ram108 [~amma@] has joined #shogun18:35
blackburnsonne-work: sonney2k: any of sonnes around? ;)18:37
blackburnRam108: I'm available18:37
Ram108lol :)18:37
Ram108thank u :)18:37
Ram108hmmm can we get started?18:38
blackburnRam108: yeah18:39
Ram108i have made the realfeatures()18:39
Ram108now i have to get make the test and train vectors18:40
blackburnRam108: ehmm one real features for test and one for train18:41
Ram108okay.........meaning after i create the test and train arrays i ll have to call the realfeatures shogun method and pass them to it right?18:43
blackburnyes, it is just object representing data18:46
Ram108k am working on it now........ am using all the data for train and just about 5 of them for test.........18:47
puneetgoyalhey, is there any function to remove the html tags?19:26
blackburnpuneetgoyal: ehm from?19:55
Ram108it says train or test features dimension mismatch19:56
Ram108what do i have to do?19:56
blackburnRam108: I guess you have to transpose matrix before creating realfeatures19:57
Ram108oh i have used all the data u gave on that webpage for test and for train i have spooled out some of them (about 5) and used it19:58
Ram108now transpose both and feed it to the realfeatures() method right?19:58
blackburnfeature matrix should have dim rows20:00
blackburnand N cols20:00
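A small sketch of the layout described here, in plain Python and assuming the data starts out as one row per sample. The helper name `to_feature_matrix` and the sample values are made up for illustration; with NumPy the same step would normally be a transpose before the matrix is handed to the features object.

```python
def to_feature_matrix(samples):
    """Turn N samples of dimension `dim` (one sample per row) into
    a matrix with `dim` rows and N columns (one sample per column)."""
    if not samples:
        return []
    dim = len(samples[0])
    if any(len(s) != dim for s in samples):
        raise ValueError("all samples must have the same dimension")
    # zip(*rows) transposes: row i of the result collects feature i of every sample.
    return [list(row) for row in zip(*samples)]


# Made-up data: N = 3 samples, each with dim = 4 features.
train_samples = [[5.1, 3.5, 1.4, 0.2],
                 [4.9, 3.0, 1.4, 0.2],
                 [6.2, 2.9, 4.3, 1.3]]
train_matrix = to_feature_matrix(train_samples)
print(len(train_matrix), len(train_matrix[0]))  # rows, columns
```

Train and test matrices built this way have the same number of rows as long as every sample has the same dimension, which is the condition the dimension-mismatch error is about.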
puneetgoyalblackburn: sry, I thought my problem was in removing the html tags...but it was not.....I was removing the stop words from the payload of an email, Its working...but not correctly...I mean most of the stop words are removed but not all of them20:00
blackburnpuneetgoyal: I see, just try to find some python lib for that20:01
puneetgoyalI have searched a lot of them....its working, but I dont know why there is still some problem: one of the stopwords is 'the'...its removing many of its instances in the payload...but not all of them20:02
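One common cause of this symptom, offered only as a guess since the code is not shown in the log, is that the comparison is exact: 'The' and 'the,' are not equal to 'the'. A minimal sketch that lowercases the text and splits on non-letter characters before comparing; the stop-word set and the function name are illustrative.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to"}  # illustrative, not a full list


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the words of `text` that are not stop words.

    The text is lowercased and split on anything that is not a letter or an
    apostrophe, so 'The' and 'the,' are both recognised as the stop word 'the'."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in stop_words]


payload = "The offer is valid. Claim the prize, the best of THE year!"
print(remove_stop_words(payload))
```

If some instances still survive after this kind of normalisation, the next thing to check would be whether the payload contains HTML entities or other markup glued to the words.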
Ram108blackburn: same error again20:05
blackburnRam108: check feature matrices sizes20:10
blackburnboth should have 4 rows20:10
Ram108yeah both does20:12
Ram108i ll use the same matrix for both test and train20:13
Ram108i ll see if that works20:13
Ram108this is the error i am getting:20:17
Ram108File "/usr/local/lib/python2.6/dist-packages/", line 20351, in train20:17
Ram108    return _modshogun.Machine_train(self, data)20:17
Ram108SystemError: [ERROR] label[-1717986918]=5.87727e-270 is not an integer20:17
blackburnyeah something is wrong with labels you use20:17
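The log does not show how the labels were built, so the following is only a sketch of a sanity check that could be run before training: one label per example, and every label integer-valued. The function name `check_labels` is hypothetical and not part of Shogun.

```python
def check_labels(labels, num_vectors):
    """Raise ValueError unless there is one integer-valued label per example."""
    if len(labels) != num_vectors:
        raise ValueError(
            "expected %d labels, got %d" % (num_vectors, len(labels)))
    for i, value in enumerate(labels):
        # float.is_integer() is False for fractions, NaN and infinity.
        if not float(value).is_integer():
            raise ValueError("label[%d]=%r is not an integer" % (i, value))
    # Normalise to floats so the result has a single, uniform type.
    return [float(v) for v in labels]


labels = check_labels([0, 1, 2, 1], 4)
```

A check like this fails early in Python with a readable index and value, rather than deep inside the training call.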
-!- heiko [] has joined #shogun20:20
heikoblackburn, around?20:20
blackburnheiko: yeah20:20
heikohow is it going?20:20
heikoblackburn, I noticed that all the python_modular tests are failing, is that also true for you?20:21
blackburnheiko: ALL?20:21
heikoyes, memerror.20:21
blackburnheiko: going fine, now some exams and other hard things with studies20:21
blackburnand you?20:21
heikoAt first I thought these were my changes20:21
blackburnlet me check20:21
heikobut I reverted and they still fail20:21
heikoI am on holidays :) relaxing at my parents and writing applications for PHD.20:22
blackburnheiko: cool :)20:23
heikobtw: I finally got a new computer with an SSD to have no more harddisk errors :)20:23
blackburnyou would have some in 5-6 years I guess ;)20:25
heikoyes, but i am using this trim stuff :)20:26
heikohowever, its fast20:26
heikoawesomely fast20:26
heikobtw shogun 1.1, also all python_modular tests fail (git branch)20:26
blackburndamn that's bad20:26
heikobut different error20:26
heikoperhaps its my installation or something20:27
blackburnheiko: compiling now20:28
heikowhich one?20:28
blackburnheiko: btw we are now using git issues20:28
heikoyes I saw it, pretty cool20:29
blackburnfeel free to open new, comment, etc20:29
heikook :)20:29
heikojust had to implement gaussian processes for my studies in matlab :)20:29
blackburnheiko: that would be awesome to have C++ one here20:29
heikoif I only had more time ... :)20:30
heikoI did not checkout data :(20:32
heikothat might be the reason for the error20:32
blackburnSegmentation fault20:32
heikoah ok20:32
heikowell, then its not my fault at least :)20:32
heikodinner is ready here, see you later!20:32
blackburnheiko: ok, see you :)20:33
blackburnI'm fucking shocked20:33
blackburnsonney2k: I guess you broke python modular20:33
Ram108blackburn: can i email u my code?20:34
blackburnRam108: feel free20:34
Ram108blackburn: i feel lost........ did u feel the same way wen u started learning this feild?20:34
blackburnRam108: some kind of20:35
Ram108sheesh i really dont kno were to start............ am not able to compile a code........ let alone making 1........ lol20:36
-!- heiko [] has quit [Read error: Operation timed out]20:36
blackburnRam108: well I could write some snippet, but tomorrow20:37
Ram108sure..... thanks.....20:38
-!- ishaanmlhtr [~ishaan@] has quit [Quit: Leaving]20:40
Ram108well were do i really get started to learn machine learning?20:42
blackburnRam108: not sure I understood Q?20:43
Ram108never mind......... lol well probably if i could get ur snippet tomorrow i could compare it with mine and fix all the errors20:45
Ram108i guess its more or less the same for all the other classifiers right?20:45
blackburnRam108: yes, it doesn't matter which classifier you use20:46
Ram108hmmm okay thanks........ :)20:46
blackburnanyway in shogun you will have to use features,labels and etc20:46
Ram108okay meet u tomorrow then its getting late here......... lol :)20:48
Ram108gd night :)20:48
-!- Ram108 [~amma@] has quit [Quit: Ex-Chat]20:50
-!- blackburn [~blackburn@] has quit [Ping timeout: 252 seconds]20:53
-!- blackburn [~blackburn@] has joined #shogun20:56
-!- blackburn [~blackburn@] has quit [Ping timeout: 252 seconds]21:00
-!- blackburn [~blackburn@] has joined #shogun21:01
puneetgoyalblackburn: you there?21:22
blackburnpuneetgoyal: yeap21:22
puneetgoyalblackburn: you asked me to compare the array from a test mail with some other array21:23
puneetgoyal['this','is','spam'] is 1.0 to ['this','is','spam'], but 0.6667 to ['this','is','sparta']21:23
puneetgoyalbut, how do I find the list with which I have to compare it to21:23
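The log does not say which similarity measure produced 1.0 and 0.6667. One measure that reproduces both numbers is the ratio from `difflib.SequenceMatcher` in the Python standard library (twice the number of matching elements divided by the combined length of both lists), sketched here as an assumption about what was used.

```python
from difflib import SequenceMatcher


def list_similarity(a, b):
    """Similarity in [0, 1] between two token lists: 2 * matches / (len(a) + len(b))."""
    return SequenceMatcher(None, a, b).ratio()


same = list_similarity(["this", "is", "spam"], ["this", "is", "spam"])
near = list_similarity(["this", "is", "spam"], ["this", "is", "sparta"])
print(same, round(near, 4))
```

As for what to compare against, one option would be to score the test list against every labelled training list and look at the labels of the closest ones.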
puneetgoyaloh sry, I was having wrong results because I took only a little training data...now I am getting somewhat good results21:27
-!- blackburn [~blackburn@] has quit [Ping timeout: 240 seconds]21:27
-!- blackburn [~blackburn@] has joined #shogun21:28
blackburnpuneetgoyal: sorry bad connection21:29
puneetgoyalblackburn: no problem, Now I am done building a dictionary counting the no. of words....using which I trained one dictionary of words...Now I am trying to compare a test dict with the trained one21:30
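A rough sketch of the dictionary step described here, using `collections.Counter` from the standard library. The training messages and the `known_fraction` score are invented for illustration and are not necessarily what puneetgoyal's code does.

```python
from collections import Counter


def count_words(messages):
    """Build a word -> count dictionary from an iterable of token lists."""
    counts = Counter()
    for tokens in messages:
        counts.update(tokens)
    return counts


def known_fraction(test_tokens, trained_counts):
    """Fraction of the test tokens that also occur in the trained dictionary."""
    if not test_tokens:
        return 0.0
    known = sum(1 for t in test_tokens if t in trained_counts)
    return float(known) / len(test_tokens)


# Made-up training data for a "spam" dictionary.
spam_counts = count_words([["win", "money", "now"], ["free", "money"]])
score = known_fraction(["free", "money", "today"], spam_counts)
```

Keeping one such dictionary per class (spam and ham) would allow the same test message to be scored against both.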
@sonney2kblackburn, what happens w/ python modular?22:43
@sonney2kwhat is broken?22:43
@sonney2kahh you mean the examples are failing22:44
@sonney2kyes sure I now return some short representation upon str instead of the serialized stuff22:45
@sonney2kso that is ok22:45
blackburnsonney2k: yes, I'm talking about python_modular examples22:47
CIA-1shogun: Soeren Sonnenburg master * r38cbdc3 / (src/shogun/base/SGObject.cpp src/shogun/base/SGObject.h): use SGStringList<char> instead of SGVector<char*> -
shogun-buildbotbuild #390 of r_static is complete: Failure [failed test_1]  Build details are at  blamelist: sonne@debian.org22:58
shogun-buildbotbuild #394 of octave_static is complete: Failure [failed test_1]  Build details are at  blamelist: sonne@debian.org22:58
-!- heiko [] has joined #shogun23:15
CIA-1shogun: Soeren Sonnenburg master * rcc60a77 / (5 files in 3 dirs):23:43
CIA-1shogun: Merge pull request #343 from karlnapf/master23:43
CIA-1shogun: changed internal storage of SGVector -
CIA-1shogun: Heiko Strathmann master * r2b685fe / examples/undocumented/libshogun/serialization_file_formats.cpp : fixed a typo -
shogun-buildbotbuild #395 of octave_static is complete: Success [build successful]  Build details are at
shogun-buildbotbuild #392 of r_static is complete: Success [build successful]  Build details are at
--- Log closed Fri Dec 23 00:00:19 2011