@sonney2kanyway - need to sleep00:12
@sonney2kcu all00:12
blackburnsee you00:14
blackburngoing to sleep too00:25
@sonney2kok I lied - we have irclogs now
@sonney2kdvevre, ^^00:41
@sonney2kenjoy and now really good night00:41
dvevresonney2k: awesome!00:41
dvevreand good night!00:41
splovinggood night00:43
serialhexYAY irc logs!!!01:09
serialhexnow i dont have to slave over learning SQL for a little while longer :D01:09
dave718Anyone have an example of working python modular code for RealFileFeatures?  I've tried building the binary file by hand and adding the data to RealFileFeatures() without args but no luck.01:31
dave718i.e. I've tried both (making the binary file, and building it up within python).01:32
-!- vivekp [~vivekp@] has joined #shogun06:20
-!- siddharth [~siddharth@] has joined #shogun06:41
-!- vivekp [~vivekp@] has quit [Read error: Connection reset by peer]07:24
-!- sploving [sploving@] has joined #shogun09:27
splovinghello sonney2k09:29
-!- vivekp [~vivekp@] has joined #shogun09:43
-!- siddharth [~siddharth@] has joined #shogun10:52
siddharthhi all10:52
-!- blackburn [~qdrgsm@] has joined #shogun12:14
-!- sploving [sploving@] has joined #shogun12:22
splovinghello sonney2k, are you here?12:22
-!- vivekp [~vivekp@] has joined #shogun12:48
-!- siddharth [~siddharth@] has joined #shogun12:48
-!- dvevre [b49531e3@gateway/web/freenode/ip.] has joined #shogun14:02
@mlsecgreat you are back, betty14:32
@bettyboomlsec: I sent an email quite a while back, given that the code had more to do with my potential project than with anything else in shogun14:32
Guest22040Is there anyone online who knows a bit of Shogun? ;)15:11
-!- sploving [~sploving@] has joined #shogun15:14
Guest22040sploving: Hi. Do you have any experience with Shogun?15:22
josipGuest22040: don't ask to ask - just ask. There might be people around that can help you15:26
splovingGuest22040,  yeap.15:28
Guest22040It is written that Parallelized Code and k-means algorithm are supported by Shogun. Do you know if I can use Shogun to process >6GB datasets and to parallelize the computation process?15:30
splovingI am not an expert on it. I am familiar with the modular typemap15:32
josipyou need to fit >6GBs in memory15:34
josipgiven that they're not sparse15:35
josipif I'm not mistaken15:36
josipyou might also want to look at #hadoop if you have really massive datasets15:36
josipwell, it will start swapping out otherwise and it will probably make it much slower15:38
josipbut you'd better wait until someone more knowledgeable comes along15:38
Guest22040I know about the hadoop but looking for sth where I do not have to install the hdfs15:38
Guest22040the dataset could be 4GB but could be also 6, 8, 20 GB15:39
josipwell, if you can fit it in memory it should work I think - but might be very slow if a lot of it is swapped out15:39
josiptry it on a small subset first15:39
Guest22040what about the complexity15:39
Guest22040Are clustering algorithms parallelized?15:40
josipin general? K-means can be parallelized15:40
Guest22040I know. Is it? :P15:41
Guest22040in Shogun? Do you know maybe? ;)15:41
josip not yet I guess15:42
josipor rather not yet ~6 months ago15:42
josipyou should wait for sonney2k tho15:44
josipGuest22040: hadoop is too troublesome to install?15:49
Guest22040no but do not have access to such cluster15:50
Guest22040I assume no ;)15:50
Guest22040I haven't tried15:50
josipthere's even a link to a CUDA implementation if you have an nvidia card15:51
Guest22040ok thx15:52
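A rough way to check josip's "it has to fit in memory" point before trying a big dataset. This is only a sketch: it assumes dense 64-bit float features, and the helper names and the 50% headroom rule are hypothetical, not anything Shogun provides.

```python
GIB = 1024 ** 3


def dense_matrix_bytes(num_vectors, num_features, bytes_per_value=8):
    """Bytes needed for a dense feature matrix (assumes 64-bit floats)."""
    return num_vectors * num_features * bytes_per_value


def fits_in_memory(num_vectors, num_features, ram_bytes, headroom=0.5):
    """True if the matrix needs at most `headroom` of the available RAM.

    The headroom fraction is a hypothetical rule of thumb that leaves room
    for the OS and for whatever working memory the algorithm itself needs.
    """
    return dense_matrix_bytes(num_vectors, num_features) <= headroom * ram_bytes


# Example: 10 million vectors with 100 features each on an 8 GiB machine.
size = dense_matrix_bytes(10_000_000, 100)
print(round(size / GIB, 2), "GiB")
print(fits_in_memory(10_000_000, 100, ram_bytes=8 * GIB))
```

If the check fails, trying a small subset first (as suggested above) is the cheaper experiment.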
-!- sploving [sploving@] has joined #shogun16:31
-!- blackburn [~qdrgsm@] has joined #shogun16:44
-!- dvevre_ is now known as dvevre17:59
vetocHi dvevre :)18:07
-!- akshayb [b49531e3@gateway/web/freenode/ip.] has joined #shogun18:08
@sonney2kblackburn, I heard my name?19:38
blackburnsonney2k: ehh?19:38
blackburn(08:10:05 PM) akshayb: blackburn chutiya hai19:39
blackburn(08:10:48 PM) akshayb: maaf karna dvevre lode hai!19:39
blackburnit's all about this :D19:39
@sonney2kblackburn, not a language you understand?19:39
blackburnyeah ;)19:39
blackburneven don't know what it is, hindu?19:40
blackburnsonney2k: how it is going?19:41
@sonney2kblackburn, life is a mess ... was weeding in the garden (and everyone except myself is sick here).19:43
blackburnsick? why? I heard it is warm in Deutschland19:43
blackburndamn segfault!19:45
@sonney2kyes it is very nice weather... no idea why just now.19:47
@sonney2kbut 40 C fever is no fun...19:47
-!- dvevre [b49531e3@gateway/web/freenode/ip.] has joined #shogun19:48
blackburn40? I hope everyone there recovers fast and you don't get sick too19:49
blackburnsonney2k: now looks like ROC? ;)
@sonney2kblackburn, why are there so many steps in there?19:57
@sonney2kdoesn't look correct to me (more like an overestimated ROC curve)19:57
blackburnsonney2k: I randomly placed +1 where -1 was and vice versa19:58
blackburnin labels19:58
blackburndoes it depend on this?19:58
@sonney2kin the predicted labels or the true ones?20:00
blackburnin true ones20:00
@sonney2kblackburn, I would start with the following labels:20:01
@sonney2k-1 +1 for true ones20:01
@sonney2kand outputs +1 +120:01
blackburnsonney2k: btw it is LDA for modified label_train_twoclass.dat20:02
@sonney2kblackburn, just don't use any classifier at all for the test20:02
@sonney2kbut only manually set labels20:03
blackburneh.. sonney2k, is it a good example?20:05
blackburnin that case we have only one point20:05
@sonney2kit should be a diagonal line20:06
blackburnoh, sorry, 220:06
@sonney2kfrom 0,0 to 1,120:06
blackburnrgh! found bug20:06
blackburnROC [[ NaN   0.]20:06
blackburn [  1.   1.]]20:06
blackburnsonney2k: yeap, it is20:08
blackburnsonney2k: can ROC be lower than diagonal..?20:09
blackburni tested it on (true: -1 1 1) (predicted: 1 1 -1)20:10
blackburnand the points are (0,0) (1,0.5) (1,1)20:11
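For anyone following along, the hand-made tests above can be reproduced with a small pure-Python sketch. It is not the Shogun code under discussion. It assumes one convention in particular: examples with equal scores are collapsed into a single ROC point, and other implementations may treat ties differently. The function name is hypothetical.

```python
def roc_points(true_labels, scores):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    Labels > 0 count as positive. Tied scores produce one point per tie
    group (an assumed convention, see the note above).
    """
    n_pos = sum(1 for y in true_labels if y > 0)
    n_neg = len(true_labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Sort by decreasing score: lowering the threshold admits more examples.
    ranked = sorted(zip(scores, true_labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label > 0:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example of a tie group.
        last_of_group = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_group:
            points.append((fp / n_neg, tp / n_pos))
    return points


# true -1 +1 with outputs +1 +1: a single diagonal step from (0,0) to (1,1)
print(roc_points([-1, 1], [1, 1]))
# true -1 1 1 with predicted 1 1 -1
print(roc_points([-1, 1, 1], [1, 1, -1]))
```

Raising an error when one class is missing avoids dividing by zero when computing the rates.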
josipsonney2k: someone asked if there is a parallel implementation of k-means in Shogun. is it implemented as of now>?20:11
blackburnfound mistake20:11
blackburnjosip: iirc it uses the distancemachine class, which is parallel20:13
josipso only the calculation of pairwise is distributed?20:14
josippairwise distance*20:15
josiperr parallel*20:15
blackburnjosip: seems cluster distance is parallel too, but we could better wait for answer of Soeren (cause he is author) :D20:16
@sonney2kjosip, it is parallel but not memory efficient (computes distance matrix)20:24
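To illustrate the memory point: a hedged pure-Python sketch of assigning points to their nearest cluster center chunk by chunk, so that a full distance matrix never exists at once. It is not how Shogun does it. Which distance matrix Shogun builds is not stated above, so the points-to-centers matrix is an assumption here, and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_in_chunks(points, centers, chunk_size=1000):
    """Return the index of the nearest center for every point.

    Only chunk_size * len(centers) distances are held at any one time,
    instead of len(points) * len(centers).
    """
    assignments = []
    for start in range(0, len(points), chunk_size):
        chunk = points[start:start + chunk_size]
        # Distances for this chunk only; discarded once the argmin is taken.
        dists = [[squared_distance(p, c) for c in centers] for p in chunk]
        assignments.extend(row.index(min(row)) for row in dists)
    return assignments


points = [(0, 0), (0, 1), (10, 10), (9, 10)]
centers = [(0, 0), (10, 10)]
print(assign_in_chunks(points, centers, chunk_size=2))
```

Each chunk is independent of the others, so the same loop could also be split across workers.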
blackburnsonney2k: can you give me an another test for ROC? ;)20:29
@sonney2ktrue -1 +1 , pred. -1 +1 :)20:30
blackburnsonney2k: 1 1 -1 both true and predicted gives ROC (0,0) (0,1) (1,0) and auROC 1.020:30
@sonney2kand +1 -1 for pred :)20:30
blackburnsonney2k: eh.. about last one20:31
blackburnis it good that I have (0,0) (1,0) (1,1)?20:31
blackburnauROC 0.020:31
blackburnit seems to be right, but don't know exactly20:32
@sonney2kme neither but at least auROC is ok20:33
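The auROC values quoted for these tiny tests can be checked by hand with the trapezoid rule over the listed ROC points. A minimal sketch, assuming the points are (false positive rate, true positive rate) pairs already sorted along the curve. The function name is hypothetical.

```python
def area_under_points(points):
    """Trapezoid-rule area under a piecewise-linear curve given as (x, y) pairs."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Vertical segments have zero width and contribute nothing.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Curve that first goes right, then up: every negative outranks every positive.
print(area_under_points([(0, 0), (1, 0), (1, 1)]))
# Curve that first goes up, then right: a perfect ranking.
print(area_under_points([(0, 0), (0, 1), (1, 1)]))
# Plain diagonal.
print(area_under_points([(0, 0), (1, 1)]))
```

So the points (0,0) (1,0) (1,1) do give an area of 0.0, consistent with the auROC reported above.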
-!- siddharth [~siddharth@] has joined #shogun20:33
blackburnhm.. okay, will push it just after some doc20:35
@sonney2kblackburn, just compare it to the python script on some realistic data sets20:36
blackburnsonney2k: can i trust it? you said it has a bug20:37
@sonney2kthe python one? it should be ok, just not when there are multiple outputs that are the same20:37
blackburnsonney2k: okay20:38
blackburnready for 'execution' ;)20:55
blackburn*using axe or any other weapon20:55
* sonney2k of course I will be using hattori hanzo manufactured swords if necessary20:56
@sonney2kas any shogun would.20:56
blackburnoh so I will drink vodka21:00
blackburnas any russian does :D21:00
-!- dvevre_ is now known as dvevre21:26
@mlsecSorry guys, but ROC is best evaluated using continuous scores22:00
@mlsecROC is deeply rooted in signal processing22:01
blackburneh.. what you mean?22:04
blackburnbecause there is no difference in evaluation algorithm when continuous or not22:06
@mlsecI was referring to: sonney2k: [20:30:10] true -1 +1 , pred. -1 +1 :)22:06
blackburnI tested it on 1.1, 1.2, -1.3, etc22:07
blackburnthe other reason why I made scores this way: mldata-utils ROC doesn't handle equal scores22:07
@mlsecThat's better.22:07
@mlsecThe interesting part about ROC curves is the interpolation for continuous scores22:08
@mlsecEg pessimistic, average and optimistic22:09
blackburnah. read some about that in fawcett's paper22:09
@mlsecYes. Good one22:09
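One way to read the pessimistic, average and optimistic variants is as different treatments of tied scores, which is how this sketch interprets them. That reading is an assumption, not necessarily what mlsec's own code does. The sketch counts positive/negative pairs and gives a tied pair a credit of 0, 0.5 or 1. The function name is hypothetical.

```python
def auc_with_tie_credit(true_labels, scores, tie_credit=0.5):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    tie_credit is what a tied pair earns: 0.0 gives the pessimistic value,
    0.5 the average one and 1.0 the optimistic one.
    """
    pos = [s for y, s in zip(true_labels, scores) if y > 0]
    neg = [s for y, s in zip(true_labels, scores) if y <= 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0
            elif p == n:
                total += tie_credit
    return total / (len(pos) * len(neg))


labels = [-1, 1, 1]
scores = [1, 1, -1]  # one positive is tied with the negative
for name, credit in [("pessimistic", 0.0), ("average", 0.5), ("optimistic", 1.0)]:
    print(name, auc_with_tie_credit(labels, scores, tie_credit=credit))
```

When all scores are distinct the three values coincide, which is why the choice only matters for discrete or repeated outputs.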
@mlsecIs there also a section averaging ROCs?22:10
@mlsecThat's also not trivial22:10
blackburnwhere 'there'? ;)22:10
@mlsecIn the paper of Fawcett?22:11
blackburnyeap, it has a section about it22:11
@mlsecI am keeping msg short, as I am writing from a smartphone22:12
blackburnok, I just didn't understand where exactly, in the class I made or in fawcett's paper22:12
blackburnI wonder how you use irc on your smartphones :) it seems to be not so convenient22:14
blackburn*Soeren did last week too22:14
@mlsechehe. it's funny22:14
@mlsecAnyway. I had a lot of fun with writing ROC code (interpolation, AUC bounded at FP, averaging)22:16
@mlsecSo I am looking forward to Shogun contributions22:17
blackburnoh I had a lot of struggles doing simple ROC22:17
blackburnmade a pull request with it22:17
blackburnnow i'm doing some 'refactoring' at shogun.Evaluation22:18
