--- Log opened Fri Jul 05 00:00:18 2013
00:04 <@sonney2k> pickle27, seen this? CJADiag.diagonalize
00:04 <@sonney2k> mathematics/ajd/ Failure
00:04 <@sonney2k> Value of: true
00:04 <@sonney2k> Expected: isperm
00:04 <@sonney2k> Which is: false
00:04 <@sonney2k> [  FAILED  ] CJADiag.diagonalize (5 ms)
00:05 -!- gsomix [~gsomix@] has quit [Ping timeout: 248 seconds]
00:05 <pickle27> sonney2k: yeah, I did see that
00:06 <pickle27> sonney2k: it creates new test data each time, but it's never failed when I've run it
00:06 <pickle27> sonney2k: it should use a chi-square distribution, but there wasn't an easy way to do that, so I left it as Gaussian, which may be the problem
00:07 <pickle27> sonney2k: but like I said, it's never happened to me before, and it passed the other builds, right?
00:09 <pickle27> sonney2k: is there an easy way to do chi-squared in Shogun? there is a nice way to do it with C++11...
00:11 <@sonney2k> pickle27, maybe you didn't initialize the rng then?
00:11 <shogun-buildbot> build #1318 of deb3 - modular_interfaces is complete: Failure [failed test python_modular]  Build details are at  blamelist: Soeren Sonnenburg <>
00:11 <@sonney2k> if test data is random you will get different results all the time
00:11 <pickle27> sonney2k: yeah, but the final result from this test should always be a permutation matrix
00:11 <pickle27> even with random input
00:12 <pickle27> I mean, it is constrained random input, so it should work
00:12 <@sonney2k> pickle27, welcome to the wonderful world of the numerics of float/double
00:12 <pickle27> sonney2k: haha, yup
00:12 <@sonney2k> pickle27, so just use a fixed seed
00:13 <@sonney2k> if that works locally then it should work remotely
00:13 <pickle27> sonney2k: okay, I'll do that
00:13 <@sonney2k> pickle27, CMath::init_random(17)
00:13 <pickle27> sonney2k: I have another PR up for the second alg, so I'll just push to that
00:13 <@sonney2k> pickle27, I also have some code style comments
00:14 <@sonney2k> pickle27, please do for (int i...)
00:14 <@sonney2k> space between for and (
00:14 <pickle27> ah, right
00:14 <@sonney2k> and also, when a for loop has more than 1 line, use { }
00:24 <pickle27> sonney2k: should be fixed up in my latest PR
00:24 <pickle27> I'm testing it on my system right now
00:33 <pickle27> sonney2k: looks like setting the random seed didn't fix the second build
00:34 <pickle27> sonney2k: could it be a clang problem?
00:34 <pickle27> sonney2k: my is_perm function isn't the best, because the matrix will have a random scale
00:35 <pickle27> it would help if I could reproduce on my computer; can I just apt-get clang and try?
00:36 <pickle27> nvm, gcc failed on the other one..
00:39 <shogun-buildbot> build #1319 of deb3 - modular_interfaces is complete: Failure [failed test python_modular]  Build details are at  blamelist: Soeren Sonnenburg <>
01:02 <pickle27> sonney2k: I sent a commit to print the matrix; hopefully this will help me figure out what's up!
01:03 <pickle27> how long do travis logs stay for? I have to step out for a bit
01:12 <pickle27> got the logs saved before I had to head out!
02:25 -!- shogun-notifier- [] has quit [Quit: transmission timeout]
02:36 -!- iglesiasg [] has quit [Quit: Ex-Chat]
<shogun-buildbot> build #391 of nightly_none is complete: Success [build successful]  Build details are at
03:36 -!- FSCV [~FSCV@] has quit [Quit: Leaving]
<shogun-buildbot> build #383 of nightly_all is complete: Success [build successful]  Build details are at
03:55 -!- zxtx_ [] has joined #shogun
03:59 -!- Netsplit *.net <-> *.split quits: zxtx
<shogun-buildbot> build #448 of nightly_default is complete: Failure [failed test]  Build details are at
05:01 -!- nube [~rho@] has quit [Quit: Leaving.]
06:03 -!- nube [~rho@] has joined #shogun
06:43 -!- nube [~rho@] has quit [Quit: Leaving.]
06:44 -!- nube [~rho@] has joined #shogun
07:44 -!- nube [~rho@] has quit [Quit: Leaving.]
08:17 <@sonney2k> pickle27, for eternity
08:17 <@sonney2k> wiking_, ping again
09:04 <wiking_> sonney2k: pong
09:06 -!- nube [~rho@] has joined #shogun
09:07 -!- votjakovr [] has joined #shogun
09:20 <sonne|work> wiking_: good morning!
09:20 <sonne|work> wiking_: could you please give me some url for the feed?
09:20 -!- hushell [] has joined #shogun
09:24 <hushell> sonney2k: Hi, I've got a strange problem. After including SGObject.h, I have to include Parameter.h to use SG_ADD, but CParameter has already been declared
09:25 <votjakovr> sonne|work: good morning :) i see that you added the evaluate_*() methods to the blacklist for c# modular; is that a temporary solution?
09:26 <sonne|work> votjakovr: good enough for some time :)
09:27 <hushell> Another question: how can I register some member whose type is const char*, or do I have to use SGString?
09:27 <votjakovr> sonne|work: ok
09:28 <sonne|work> votjakovr: it is totally unclear why using 2 SGVectors or other SG* datatypes works with all other typemaps but our csharp one
09:28 <sonne|work> votjakovr: needs to be bug reported / toy example created
09:28 <sonne|work> votjakovr: anyway, not so bad
09:29 <sonne|work> hushell: well, depends :) what do you use your char* for?
09:29 <sonne|work> hushell: best is to use SGVector<char>
09:32 <hushell> sonne|work: I want to have an identity
09:33 -!- nube [~rho@] has quit [Quit: Leaving.]
09:33 <hushell> "astring" cannot be converted to SGVector<char> implicitly
09:34 <hushell> but std::string is possible
09:35 -!- nube [~rho@] has joined #shogun
09:37 <sonne|work> hushell: astring?
09:37 <hushell> sonne|work: for the const char* member, I need to pass a "name" in the argument of the constructor
09:38 <hushell> "astring" means a string in c++ :)
09:39 <hushell> I mean in a function call
09:40 <sonne|work> hushell: as in void foo(int x, const char* name)?
09:41 <hushell> sonne|work: yep
09:41 <sonne|work> hushell: yeah, sure
09:42 <hushell> so const char* cannot be used as a member? if we have to register it
10:01 -!- iglesiasg [~iglesias@2001:6b0:1:1041:fda4:69d9:9772:7713] has joined #shogun
10:01 -!- mode/#shogun [+o iglesiasg] by ChanServ
10:22 <sonne|work> hushell: register means?
10:23 <sonne|work> hushell: without more context I cannot really answer this
10:35 <hushell> sonne|work: I solved it by using SGString, but I am wondering why we need to register member variables? register here means SG_ADD
10:36 <sonne|work> you only need that when you want to say that this variable needs to be saved (serialization) or could be used in model selection
10:41 <hushell> sonne|work: Thanks! then not everybody needs to do that
11:07 -!- van51 [] has joined #shogun
11:07 <sonne|work> hey van51
11:07 <sonne|work> good morning
11:07 <van51> sonne|work: hello
11:07 <van51> sonne|work: good morning to you too
11:08 <sonne|work> I was wondering how the training time is now and what parameters you used
11:08 <van51> sonne|work: I think it's still taking a lot of time
11:08 <van51> sonne|work: all I did was specify C=4
11:09 <sonne|work> van51: and epsilon?
11:09 <sonne|work> van51: any normalization?
11:10 <sonne|work> as in, make vectors norm = 1?
11:10 <sonne|work> no, right?
11:10 <sonne|work> then it is no wonder :)
11:10 <van51> sonne|work: no, nothing like that :)
11:11 <sonne|work> van51: so what epsilon did you set then?
11:11 <van51> sonne|work: yea, I just wanted to get it running first
11:11 <sonne|work> ok, 1e-3 with not properly scaled data will kill you
11:12 <sonne|work> van51: a standard trick is to divide the vector by the number of non-zero elements
11:13 <sonne|work> van51: so you should implement support for that in your features (optional of course)
11:14 <sonne|work> for n-grams it is rather easy since it's a constant
11:14 <sonne|work> for delimited words it depends on #words
11:14 <sonne|work> van51: OK?
11:15 <van51> sonne|work: ok
11:15 <sonne|work> van51: just add a hack for the moment to see how fast it becomes
11:16 <van51> sonne|work: so just before it returns the vector in dense_dot?
11:16 -!- nube [~rho@] has quit [Quit: Leaving.]
11:16 <van51> sonne|work: actually it doesn't return a vector
11:17 -!- nube [~rho@] has joined #shogun
11:17 <sonne|work> van51: it returns a scalar, so just multiply with that normalization const
11:18 <sonne|work> i.e. norm_const = 1.0/num_ngrams
11:18 <sonne|work> van51: with add_to_dense_vec you have to do it for each element
11:19 <sonne|work> van51: and dot the thing squared
11:27 <van51> sonne|work: why dot the thing squared?
11:29 <sonne|work> van51: it is (a * norm_const) * (b * norm_const)
11:29 <sonne|work> (both a & b are normalized)
11:37 -!- iglesiasg_ [] has joined #shogun
11:37 -!- iglesiasg [~iglesias@2001:6b0:1:1041:fda4:69d9:9772:7713] has quit [Read error: Connection reset by peer]
11:37 -!- iglesiasg_ [] has quit [Client Quit]
11:37 -!- iglesiasg [~iglesias@2001:6b0:1:1041:fda4:69d9:9772:7713] has joined #shogun
11:37 -!- mode/#shogun [+o iglesiasg] by ChanServ
11:49 <van51> sonne|work: so in a setting of 500 examples, C=4, default epsilon
11:49 <van51> sonne|work: with converter it takes like 2-3 secs
11:49 <van51> sonne|work: dot-features took -last night- 4 mins
11:50 <van51> sonne|work: and I don't see an improvement with normalization
11:51 <sonne|work> yeah, but C=1 is probably OK for scaled data; C=4 for unscaled, as you have, is way too high. I guess more in the range of 1e-3
11:51 <sonne|work> van51: that cannot be
11:59 <van51> sonne|work: there is a significant speedup with C=0.001
11:59 <van51> sonne|work: but with normalization the results seem worse
12:00 <sonne|work> van51: sure, results are not comparable
12:00 <sonne|work> you need a different C
12:00 -!- HeikoS [] has joined #shogun
12:01 -!- mode/#shogun [+o HeikoS] by ChanServ
12:03 -!- van51 [] has quit [Read error: Connection reset by peer]
12:04 -!- van51 [] has joined #shogun
12:14 -!- votjakovr [] has left #shogun ["Went to the store!"]
12:14 <sonne|work> van51: can you show me how you normalize?
12:15 -!- nube [~rho@] has quit [Ping timeout: 248 seconds]
12:15 <van51> sonne|work: yea
12:15 <van51> sonne|work: one moment
12:23 -!- lambday [67157e4f@gateway/web/cgi-irc/] has joined #shogun
12:23 <lambday> HeikoS: hi
12:23 <@HeikoS> lambday: hi!
12:23 <lambday> HeikoS: I tested with one matrix with a pathetic condition number (10^4)
12:23 <lambday> and the accuracy is 1E-5
12:24 <lambday> the trace
12:24 <lambday> of log
12:24 <lambday> I think if we want more accuracy, we should use arprec
12:24 <lambday> (for shifts, weights etc)
12:24 <lambday> the accuracy I wanted is 1E-19
12:24 <@HeikoS> this is with direct solves?
12:25 <@HeikoS> lambday: and can you easily try arprec?
12:25 <lambday> HeikoS: for Jacobi elliptic functions I already have the arprec version
12:25 <@HeikoS> lambday: I see
12:26 <@HeikoS> lambday: this might actually be the solver
12:26 <lambday> HeikoS: brb... a call
12:26 <@HeikoS> lambday: I think that's fine for now (especially since this is only to test whether things work); for the real deal with sparse matrices and cocg_m, we should get a better accuracy
12:29 <lambday> HeikoS: back
12:29 <lambday> HeikoS: yes...
12:29 <lambday> and also, this is the difference in the trace
12:29 <@HeikoS> lambday: what do you mean?
12:30 <lambday> not the norm of the difference of the approximated log(m) and actual log(m)
12:30 <@HeikoS> ah yes
12:30 <@HeikoS> but we have the exact trace, right?
12:30 <lambday> yes... I was checking with octave... I'll add the eigen3 version soon
12:31 <@HeikoS> lambday: okay
12:31 <@HeikoS> lambday: good!
12:31 <lambday> not too bad, right?
12:31 <lambday> hmm :)
12:31 <@HeikoS> sounds good, yes :)
12:31 <@HeikoS> so now, the more interesting things begin :D
12:31 <lambday> yes :D
12:31 <@HeikoS> conjugate gradient pain ;)
12:31 <lambday> the next two days I can give fully to gsoc..
12:31 <lambday> weekends yay :D
12:32 <lambday> I should add the sparse thing before going into cocg
12:32 <@HeikoS> lambday: yes, that's true
12:33 <lambday> oh, and what about having a different base for cocg_m?
12:33 <@HeikoS> lambday: explain this a bit
12:33 <lambday> I don't think we can manage it in the same interface as other solvers
12:33 <lambday> since their solve returns an SGVector
12:33 -!- wiking_ is now known as wiking
12:34 -!- wiking [] has quit [Changing host]
12:34 -!- wiking [~wiking@huwico/staff/wiking] has joined #shogun
12:34 -!- mode/#shogun [+o wiking] by ChanServ
12:34 <lambday> for cocg_m, we should return an SGMatrix instead
12:34 <lambday> and the sum can't go inside the solve
12:34 <lambday> because each of the solution vectors needs to be multiplied with its corresponding weight before the sum
12:34 <lambday> passing the weights inside the solve of cocg_m would work, but I don't think that's a good idea :(
12:35 <@HeikoS> lambday: yeah, you are right
12:35 <@HeikoS> lambday: damn ;)
12:35 <@HeikoS> do you have a suggestion?
12:35 <lambday> tell me about it :'(
12:35 <lambday> no :'(
12:35 <lambday> except having a different base...
12:36 <lambday> it won't cost generality, because I moved the m_linear_solver down to the implementation of CRationalApproximation
12:37 <lambday> so, CLogRationalApproximationIndividual will have CLinearSolver m_linear_solver, and CLogRationalApproximationCOCG will have C<suggest-something>Solver m_linear_solver
12:37 <@HeikoS> can't we use a base class for these two types of solvers?
12:38 <@HeikoS> you know what
12:38 <@HeikoS> that's fine, your suggestion
12:38 <lambday> how shall we differentiate the signatures? :(
12:38 <@HeikoS> it *is* a different solver
12:38 <lambday> it's just the return type that changes :(
12:38 <@HeikoS> which does something different
12:38 <@HeikoS> i.e. solve multiple systems
12:38 <lambday> yes it is...
12:38 <@HeikoS> so that's fine
12:39 <lambday> please suggest names (I suck at it :( )
12:45 -!- iglesiasg [~iglesias@2001:6b0:1:1041:fda4:69d9:9772:7713] has quit [Ping timeout: 245 seconds]
12:48 <lambday> holy crap, using eigen3 we get super duper accuracy! :-o
12:48 <lambday> this is rational approximation: 4.60517018598809446672
12:49 <lambday> this is rational approximation: 4.60517018598809446672
12:49 <van51> sonne|work: I have to g2g
12:49 <van51> sonne|work: I'll be back in 2-2.5 hours
12:49 <lambday> HeikoS: :D
12:50 <@HeikoS> lambday: wow :D
12:50 -!- van51 [] has quit [Quit: Leaving.]
12:50 <@HeikoS> lambday: wait, what did you change there?
12:51 <lambday> HeikoS: look at the accuracy we got: 2.664535259100375697e-15
12:51 <lambday> nothing! I just used eigen3's log instead of testing against what octave gives
12:51 <@HeikoS> lambday: nice!
12:51 <@HeikoS> yeah, octave sucks ;D
12:51 <@HeikoS> lambday: careful about this though
12:52 <@HeikoS> eigen3 probably uses a similar trick for computing matrix logs :)
12:52 <@HeikoS> so it's like running the same code twice
12:52 <@HeikoS> but it is still very good!
12:52 <lambday> :) :)
12:53 <@HeikoS> "This function computes the matrix logarithm using the Schur-Parlett algorithm"
12:53 <lambday> eigen3's log gives the whole matrix
12:53 <@HeikoS> no, it's different
12:53 <@HeikoS> it's the Higham paper
12:53 <@HeikoS> I tried that before; ours will be better for large ones :)
12:53 <lambday> hope so :) :)
12:53 <@HeikoS> very very encouraging
12:54 <lambday> yessss!! :D
12:54 <lambday> I'll add this unit test real soon!
12:58 -!- iglesiasg [] has joined #shogun
13:02 <lambday> HeikoS: using arprec we got that ~1e-15 accuracy, using normal float64_t we got ~1e-8
13:02 <@HeikoS> lambday: ok good, these are very useful values for the documentation later on
13:02 <@HeikoS> lambday: so keep them, make them into unit tests
13:02 <lambday> okay :)
13:23 <lambday> I'll be back later.... :)
13:23 <lambday> see you
13:23 -!- lambday [67157e4f@gateway/web/cgi-irc/] has quit [Quit: lambday]
14:27 -!- iglesiasg [] has quit [Quit: Ex-Chat]
14:51 -!- Netsplit *.net <-> *.split quits: @HeikoS, pickle27, hushell, sonne|work, flxb, shogun-buildbot, zxtx_, naywhayare, @sonney2k, @wiking, (+1 more, use /NETSPLIT to show all of them)
14:56 -!- Netsplit over, joins: @wiking, @sonney2k, shogun-buildbot
14:57 -!- Netsplit over, joins: @HeikoS, hushell, zxtx_, pickle27, sonne|work, flxb, naywhayare, sanyam
15:11 -!- mode/#shogun [-ooo sonney2k wiking HeikoS] by ChanServ
15:14 -!- Netsplit *.net <-> *.split quits: hushell
15:15 -!- Netsplit over, joins: hushell
15:18 -!- Netsplit *.net <-> *.split quits: flxb, naywhayare, HeikoS
15:20 -!- Netsplit over, joins: @HeikoS
15:20 -!- Netsplit over, joins: flxb
15:26 -!- naywhayare [] has joined #shogun
15:55 -!- van51 [] has joined #shogun
15:55 -!- van51 [] has quit [Client Quit]
15:56 -!- van51 [~van51@] has joined #shogun
16:08 -!- van51 [~van51@] has quit [Remote host closed the connection]
16:12 -!- van51 [] has joined #shogun
16:32 -!- kevin_ [] has joined #shogun
16:36 -!- pickle27 [] has quit [Ping timeout: 276 seconds]
16:40 -!- kevin_ is now known as pickle27
17:09 -!- foulwall [~user@2001:da8:215:c252:482c:7add:959d:1be5] has joined #shogun
17:30 -!- van51 [] has left #shogun ["QUIT :Leaving."]
17:40 -!- sonne|work [] has left #shogun []
17:40 -!- foulwall [~user@2001:da8:215:c252:482c:7add:959d:1be5] has quit [Remote host closed the connection]
18:06 -!- lisitsyn [] has joined #shogun
18:10 -!- nube [~rho@] has joined #shogun
18:32 -!- nube [~rho@] has quit [Quit: Leaving.]
19:13 -!- van51 [] has joined #shogun
19:59 -!- hushell [] has quit [Ping timeout: 264 seconds]
20:02 -!- mode/#shogun [+o sonney2k] by ChanServ
20:04 <@sonney2k> van51, I meant how you normalize
20:04 <@sonney2k> van51, as a gist
20:07 <van51> sonney2k: I'm not following :)
20:08 <van51> sonney2k: there was the normalization in the gist I sent you earlier
20:11 <van51> sonney2k: do you want something else?
20:17 -!- hushell [] has joined #shogun
20:21 -!- HeikoS [] has quit [Quit: Leaving.]
20:46 <@sonney2k> van51, oops, looked at the wrong one
20:46 <pickle27> sonney2k: my unit test is still failing on Travis; I set the random seed for CMath, but I'm also using setRandom from Eigen3
20:46 <pickle27> do you know how to set the random seed for eigen3? I can't find how to do it
20:46 <@sonney2k> van51, ok, some bug is in there still
20:47 <@sonney2k> van51, in line 12 it should be 1.0/((sv1.size()-3)*(sv2.size()-3))
20:47 <@sonney2k> van51, note that you do 1/sv1.size(), which will always be 0
20:47 <@sonney2k> in line 44
20:47 <@sonney2k> it should be 1.0/(sv.size()-3) for the same reason
20:48 <van51> sonney2k: woops
20:48 <@sonney2k> and line 62 should be
20:48 <@sonney2k> and then line 65 should be removed
20:49 <@sonney2k> and line 72 should be just += n_const;
20:49 <van51> sonney2k: idd
20:49 <@sonney2k> van51, please fix and show me again
20:53 <van51> sonney2k: I updated the gist
20:57 <lisitsyn> pickle27: hey
20:57 <@sonney2k> van51, looks good except for the missing ; in line 63
20:58 <@sonney2k> van51, so try again!
20:58 <van51> sonney2k: yea, the compiler told me!
20:58 <@sonney2k> van51, btw how do you compile / what interfaces do you compile for?
20:58 <lisitsyn> pickle27: maybe srand does the job
20:58 <@sonney2k> van51, I am kind of your compiler too
20:58 <@sonney2k> van51, quick, new results please :-)
20:58 <@sonney2k> it should be lightning fast now
20:59 <pickle27> lisitsyn: I tried with srand and it didn't fix travis
20:59 <lisitsyn> sonney2k: struct a; template <typename T> struct b { }; struct a : b<a> { };
20:59 <@sonney2k> lisitsyn, van51's compiler, not yours :P
21:00 <@sonney2k> pickle27, why srand?
21:00 <lisitsyn> sonney2k: why am I alone
21:00 <@sonney2k> pickle27, CMath::init_random!
21:00 <lisitsyn> sonney2k: eigen's random
21:00 <@sonney2k> lisitsyn, is that needed?
21:00 <@sonney2k> no idea what you do
21:01 <lisitsyn> sonney2k: I'd not use it actually
21:01 <lisitsyn> didn't notice pickle27 used it
21:01 <pickle27> lisitsyn: sonney2k I don't think that's the problem anymore; looking into some other things
21:03 <pickle27> sonney2k: lisitsyn I replaced using Eigen's random with CMath; let's see what happens with travis now
21:04 <@sonney2k> pickle27, but it works locally?
21:04 <pickle27> sonney2k: it's never failed for me
21:05 <pickle27> it's testing whether or not the end result is a permutation matrix
21:06 <pickle27> on Travis there is a column that is all zeros
21:07 <@sonney2k> van51, how do you compile?
21:07 <@sonney2k> van51, any results already?
21:08 <van51> sonney2k: on 50 examples, with C=0.001 it takes 75s
21:08 <van51> sonney2k: with C=1 it takes 25s
21:08 <@sonney2k> 50k examples?
21:08 <@sonney2k> or 50?
21:08 <van51> sonney2k: I compile for the static interface
21:08 <van51> just 50
21:08 <@sonney2k> with or w/o optimizations?
21:09 <van51> sonney2k: on that machine right now it's with
21:09 <@sonney2k> van51, if you just need C++
21:09 <@sonney2k> van51, then you can do ./configure --interfaces=
21:09 <@sonney2k> and then make / make install
21:09 <van51> sonney2k: ah ok
21:09 <@sonney2k> you sure that it takes the right lib?
21:10 <van51> sonney2k: yeah, I believe so
21:13 <@sonney2k> van51, how many positive / negative examples has this?
21:14 <@sonney2k> and ngram-size is what, 3?
21:15 <pickle27> sonney2k: lisitsyn why hasn't travis started on my latest commit?
21:16 <lisitsyn> pickle27: I guess it is enqueued
21:16 <@sonney2k> van51, look at page 89 in
21:16 <@sonney2k> van51, table 4.4
21:16 <pickle27> doesn't look like anything is queued
21:16 <@sonney2k> that is a 'slow' method (compared to what you have) running on webspam
21:17 <@sonney2k> it takes 2 secs for 100 examples
21:18 <@sonney2k> van51, try with n=8
21:24 <pickle27> lisitsyn: okay, it's building now; hopefully Travis likes it this time
21:38 <@sonney2k> van51, ok, so let's do a quick benchmark
21:39 <@sonney2k> van51, take the 50 examples and just call add_to_dense_vec with all of them on some null vector and measure the time
21:39 <van51> sonney2k: ok
21:39 <@sonney2k> van51, btw this is a good benchmark for dotfeatures anyway - so it makes a lot of sense to do this in the CDotFeatures class
21:39 <@sonney2k> van51, maybe there even is sth like this already in there
21:39 <van51> sonney2k: on it
21:39 <@sonney2k> van51, indeed
21:40 <@sonney2k> there is
21:40 <@sonney2k> van51, just call benchmark_add_to_dense_vector()
21:40 <@sonney2k> and benchmark_dense_dot_range()
21:41 <@sonney2k> van51, I would expect it takes <1s
21:41 <van51> sonney2k: with the default number of repeats?
21:41 <@sonney2k> van51, yeah
21:41 <@sonney2k> it is averaging
21:47 <lisitsyn> pickle27: something is happening with your PR
21:48 <@sonney2k> van51, ok, then if liblinear is taking > 1000 iterations you can get such bad results
21:48 <@sonney2k> van51, let's try SVMOcas instead of liblinear
21:49 <@sonney2k> van51, same syntax, just CSVMOcas(C, data, labels)
21:49 <van51> sonney2k: ok, and I was looking for the class reference now :P
21:50 <@sonney2k> van51, I will have to leave in 10 minutes - so please give me a result before :)
21:51 <lisitsyn> sonney2k: we need to fix lua detection
21:51 <lisitsyn> let me try to do that
21:51 <@sonney2k> van51, in any case you should update olivier/benoit on your progress, and even send them the example you wrote and describe what you did
21:51 <@sonney2k> lisitsyn, hmmhh, so I guess I broke it
21:51 <lisitsyn> if it finds lua it tries to compile *even* if no headers are there
21:52 <@sonney2k> lisitsyn, I was adding support for lua52 some months back
21:52 <@sonney2k> I guess I broke sth
21:52 <van51> sonney2k: now it takes 0.88s for 100 examples
21:52 <lisitsyn> sonney2k: well, it should just fail with no headers
21:52 <lisitsyn> I will try to patch it now
21:52 <van51> sonney2k: it's much much faster
21:53 <@sonney2k> van51, ok, then give it say 10k examples
21:53 <@sonney2k> van51, it might be that liblinear recovers with many more examples
21:53 <@sonney2k> van51, liblinear is numerically not that stable
21:54 <lisitsyn> btw I can confirm now the iphone has libsvm inside :D
21:54 <lisitsyn> kind of a huge success for these guys
21:57 <@sonney2k> weird though
21:57 <@sonney2k> lisitsyn, what do they learn with libsvm/liblinear?
21:57 <@sonney2k> van51, btw did you enable progress output?
21:57 <lisitsyn> sonney2k: no idea, but the license is inside
21:58 <lisitsyn> sonney2k: face recognition? who knows
21:58 <van51> sonney2k: no, I did not
21:58 <van51> sonney2k: I have a run that finished in 112s
21:58 <van51> sonney2k: but the first one segfault'ed
21:58 <van51> sonney2k: and the next one said "corrupted double-linked list"
21:58 <@sonney2k> lisitsyn, I mean, I can understand they *learn* some models on some cluster(s), but then just applying stuff doesn't need a license or anything
21:59 <lisitsyn> sonney2k: no, it is in the license of any iphone
21:59 <lisitsyn> so some code is running on the iphone
21:59 <@sonney2k> lisitsyn, no, face recog etc - that is all pretrained
21:59 <@sonney2k> van51, sounds bad
21:59 <@sonney2k> van51, enable progress output!
21:59 <lisitsyn> sonney2k: I know
21:59 <@sonney2k> van51, or so
22:00 <@sonney2k> van51, not good about the crash - valgrind on some subset...
22:01 <@sonney2k> van51, 10k examples took >6000s with the 'old' approach, so yes, about 100 sounds right
22:01 <@sonney2k> van51, alright, I am off - keep it going!
22:02 <van51> sonney2k: ok! at least that is promising
22:03 <lisitsyn> van51: can you explain to me the things you are doing, in a few words?
22:03 <van51> lisitsyn: sure
22:04 <van51> lisitsyn: right now we are trying to benchmark CHashedDocDotFeatures, which stores internally a CStringFeatures object and, whenever a dot product is required, tokenizes the appropriate string feature vector on the fly
22:05 <van51> lisitsyn: then hashes the tokens to a dimension d
22:05 <van51> which is much smaller than the dimension of the entire document collection
22:05 <van51> and the idea then is that you train a linear model on that smaller dimension
22:06 <lisitsyn> so the internal storage is still strings?
22:07 <lisitsyn> why is it more efficient than just storing hashes?
22:08 <lisitsyn> I mean, sounds like hashes are compressing things
22:09 <lisitsyn> van51: just trying to understand ;)
22:09 <van51> lisitsyn: well, from what I understand, pre-hashing the tokens takes time and space
22:09 <van51> lisitsyn: maybe not that much now that the collection fits in memory
22:10 <lisitsyn> say I have
22:10 <lisitsyn> a 1 mb text file
22:10 <lisitsyn> how much space does the hashed thing take?
22:11 <van51> it depends on the hash size that you specify
22:12 <van51> imagine you try to fit that text file in a vector of size 2^16, for instance
22:13 <van51> lisitsyn: this post here explains it well:
22:14 <lisitsyn> van51: but that's BoW, right?
22:14 <lisitsyn> I mean, transforming doc -> 2^16 binary features
22:15 <van51> lisitsyn: actually it's a count
22:15 <van51> lisitsyn: and the BoW representation would have a large dimension of all possible tokens, say N
22:15 <van51> lisitsyn: here we specify a dimension d << N
22:16 <lisitsyn> van51: one question that would clarify
22:16 <lisitsyn> BoW is indeed memory inefficient (like N possible tokens)
22:17 <lisitsyn> but you say when using hashing we get d << N; why not compute them explicitly?
22:18 <van51> lisitsyn: explicitly you mean beforehand?
22:18 <lisitsyn> van51: yes
22:19 <van51> lisitsyn: well, I'm not an expert; I'll just tell you what I have read and come to understand
22:19 <lisitsyn> van51: yes, I am not an expert at all either :)
22:20 <van51> lisitsyn: precomputing it would take up some time beforehand and also more space
22:20 <van51> lisitsyn: either on disk or in memory
22:20 <lisitsyn> van51: so it takes more time with hashing but less space?
22:21 <van51> lisitsyn: I'm guessing it's the good old trade-off, yeah
22:21 <lisitsyn> alright, thanks
22:21 <van51> lisitsyn: also, I think it would be hard if your collection had to be streamed
22:22 -!- shogun-notifier- [] has joined #shogun
22:22 <shogun-notifier-> shogun: Sergey Lisitsyn :develop * 8a34d14 / src/configure:
22:22 <shogun-notifier-> shogun: Fixed lua detection
22:22 <lisitsyn> naywhayare: I guess that ^ fixes the thing you reported on lua
22:24 <naywhayare> rockin'. glad I could help a bit :)
22:24 <naywhayare> (even though technically I only pointed out the problem and didn't quite help)
22:24 <naywhayare> thanks :)
22:24 <lisitsyn> naywhayare: thanks for reporting!
22:41 -!- hushell [] has quit [Ping timeout: 268 seconds]
22:45 -!- travis-ci [] has joined #shogun
22:45 <travis-ci> [travis-ci] it's Sergey Lisitsyn's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun:
22:45 -!- travis-ci [] has left #shogun []
22:50 <shogun-buildbot> build #1197 of bsd1 - libshogun is complete: Failure [failed test_1]  Build details are at  blamelist: Sergey Lisitsyn <>
23:05 -!- iglesiasg [] has joined #shogun
23:05 -!- mode/#shogun [+o iglesiasg] by ChanServ
23:09 <shogun-buildbot> build #1320 of deb3 - modular_interfaces is complete: Failure [failed test python_modular]  Build details are at  blamelist: Sergey Lisitsyn <>
--- Log closed Fri Jul 05 23:23:30 2013
--- Log opened Fri Jul 05 23:23:36 2013
23:23 -!- shogun-toolbox [] has joined #shogun
23:23 -!- Irssi: #shogun: Total of 13 nicks [2 ops, 0 halfops, 0 voices, 11 normal]
23:23 -!- Irssi: Join to #shogun was synced in 7 secs
23:46 <pickle27> lisitsyn: yeah, I saw; it still failed though, I don't understand
23:47 <pickle27> lisitsyn: I'll discuss with you later, maybe tomorrow? the result should be a permutation matrix, and on my systems it is, but on Travis one of the columns doesn't have a one
23:47 <pickle27> lisitsyn: it looks like it's usually the first column too
23:47 <pickle27> lisitsyn: I don't know what's up
23:47 -!- pickle27 [] has quit [Quit: Leaving]
--- Log closed Sat Jul 06 00:00:19 2013