I checked! but for real I think they are same :-/
because for dot, they have to take the conjugate of p or something like that
lambday: what are the consequences?
HeikoS: consequence is, if for this case (real linear operator with real vector but complex shift) dot is same as the other one, then we won't have to call it COCG_M.. it will become just CG_M
I'll check whether they give same results for both of these
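One way to read the point above, as a minimal sketch: the conjugated and unconjugated dot products only differ when the vectors have imaginary parts, so for real vectors either choice gives the same number. The helper names are illustrative and not Shogun functions.

```python
def dot_conjugated(u, v):
    # Hermitian-style dot product: conjugate the first argument.
    return sum(a.conjugate() * b for a, b in zip(u, v))


def dot_unconjugated(u, v):
    # Bilinear dot product: no conjugation.
    return sum(a * b for a, b in zip(u, v))


real_r = [1.0, -2.0, 0.5]
complex_r = [1 + 2j, -1j, 0.5]

# For real vectors both definitions agree.
same_for_real = dot_conjugated(real_r, real_r) == dot_unconjugated(real_r, real_r)
# For genuinely complex vectors they do not.
same_for_complex = dot_conjugated(complex_r, complex_r) == dot_unconjugated(complex_r, complex_r)
```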
HeikoS: oh and the meeting is tonight, right? 19:00 GMT?
lambday: yes tonight
HeikoS: what will be the topic? you said it's really crucial for us to pass the mid-term
lambday: organisational
is everyone on track? problems? progress?
what does the mid-term mean?
how are the example ideas going?
ipython notebook examples
hey guys
HeikoS: ipython notebook example?
lambday: yes, we want all students to write one :)
http://nbviewer.ipython.org/5982623
HeikoS: okay...
lambday: remember everyone should do a big example? in addition to small illustrating ones? the big one should be in the form of such a notebook
like a report for gsoc
HeikoS: yes... ours will be on the huge sparse matrix
HeikoS: alright... gotcha
can I help with get_name methods in classes? it breaks many unit-tests.
gsomix: I fixed another one... will add in the next PR
gsomix: I will check the unit tests now, it might be that I need help :)
lambday:  did you do any recent changes to sparse features?
unit test fails, but did not before
HeikoS: I added complex
HeikoS: which one fails?
lambday: what do the unit tests on your machine do?
SparseFeaturesTest.subset_get_full_feature_matrix_smaller
HeikoS: I'm checking
HeikoS: mine gives segfaults in ProductKernelTest.test_array_operations and does not progress any further :(
lambday: do you have the latest git
and make clean in tests before?
yes just did a git pull
checking again
HeikoS: yes.. it gives segfaults..
ok
will try to fix it
HeikoS: but I added that before the clone unit-tests
it didn't give this fault that time :(
lambday: I know I wrote these tests 2 weeks ago and they worked
ah I should not have let the unit tests fail for so long :(
now its hard to detect where this came from
:(
when I added complex for sparse features, these all were still good
I didn't change anything in past 1 week in that I guess
ok
thanks for checking
HeikoS: no problem.. let me know if I can do anything :(
found it
what was it?
changes in sparse features
there are more problems
good that we have some tests
HeikoS, hey...
sonney2k: hi
sonney2k: oh man, I will fix the tests asap now
some things already slipped through
sonney2k: maybe we should disable the equals tests for now
until they work
HeikoS, the other option is to disable
yes
sonney2k: yeah maybe thats a good idea
sonney2k: do you know how to do that?
HeikoS, well if you tell me what fails I just put the classes in some blacklist
sonney2k: no thats not the way to go
just disable the automated ones
should be fine
thunderstorm!
HeikoS, no it is the way to go so... we then know the failing tests
and can one by one fix them
HeikoS, good luck
I have yet to see a cloud here
haha :)
sonney2k: no its better to work on the equals tests at once
they are not stable yet
sonney2k: hmm like a combiner?
disable all of them and try to fix before activating
gsomix, thanks a lot! much easier to digest
sonney2k: are you referring to the case Benoit mentioned or in general?
wiking: around?
van51, like a CPreprocessor
sonney2k: I'm so sorry for this. =___=
van51, no benoit's speedup is excellent
van51, apart from that it is exactly like you do it
sonney2k: yeah I think it could be done in a CPreprocessor
sonney2k: it is very costly though in dense features..
van51, I am just thinking of a framework that we can add to DotFeatures to compute other stuff
like quadratic features or the normalization you did
sonney2k: yea sure I'm all for that
van51, anyway for now do the quadratic features as you do them
sonney2k: I've got a question
van51, shoot
sonney2k: you suggested yesterday to compare the results to  PolyKernel
and that's what I tried to do in the example
sonney2k: I think I also computed the terms like you said
sonney2k: HeikoS is meeting today?
gsomix: yes
van51, and?
HeikoS: UTC?
sonney2k: the quadratic version is much slower
especially in high dimensional data
gsomix: see email
but that's expected I guess
ah, thanks
for how well they do now
evening
should I test them on the same machine, with the same settings?
HeikoS, yay! http://www.heise.de/newsticker/meldung/Hoster-OVH-gehackt-Wir-waren-nicht-paranoid-genug-1921721.html
that's ours
sonney2k: what I did was to use a linear machine again, svmocas, but it depends on the settings
sonney2k: our webserver?
nice!
good luck we are open-source
van51, ahh no do it differently
van51, create a very small example with 3 features / 2 examples
van51, then compute the PolyKernel (get_kernel_matrix())
van51, but don't use any normalization
van51, this polykernel has degree 2
build #1497 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1497  blamelist: Soeren Sonnenburg
van51, then use the dotfeatures and also compute the polykernel w/o normalization for degree=1 !
van51, elements of the two matrices should be the same then
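The check described above can be sketched in plain Python. This assumes the homogeneous form of the polynomial kernel, (x·y)^2 with no constant term and no normalization, and an explicit feature map made of all ordered products x_i*x_j. The data and helper names are illustrative, not Shogun API.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def poly_kernel(x, y, degree=2):
    # Homogeneous polynomial kernel, no normalization (assumption).
    return dot(x, y) ** degree


def quadratic_features(x):
    # Explicit quadratic feature map: all ordered products x_i * x_j.
    return [xi * xj for xi in x for xj in x]


# 2 examples with 3 features each, as suggested in the discussion.
examples = [[5, 1, 1], [1, 2, 0]]

# Degree-2 kernel on the raw features.
K_poly = [[poly_kernel(a, b) for b in examples] for a in examples]

# Degree-1 (linear) kernel on the explicit quadratic features.
mapped = [quadratic_features(x) for x in examples]
K_quad = [[dot(a, b) for b in mapped] for a in mapped]
```

With integer inputs the two matrices match exactly; with floating-point data a tolerance-based comparison would be the safer test.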
lambday, I have an issue - with c++11 the diag[i].imag()-=0.75 in the test no longer works - do you know what the syntax for c++11 is?
sonney2k: ok got it :)
HeikoS: I think we just have to patch line 22 here
https://github.com/shogun-toolbox/shogun/blob/develop/tests/unit/base/clone_unittest.cc.py
see you later during the meeting guys
iglesiasg: see you man :)
lisitsyn: yes, cool, could you do that?
iglesiasg: see you!
HeikoS: we gotta think a bit about that tridiagonal solver and then coloring.. :-/
HeikoS: krylstat uses colpack
brb
lambday: yes indeed
HeikoS: is multiclass svm abstract?
shogun-buildbot: leave me alone! :3
HeikoS, I cannot sense any luck here...
sonney2k: ?
sonney2k: just fixed another 5 bugs
getting there
but its a lot of work alone
sonney2k: is multiclass svm abstract?
HeikoS, please push them!
lambday: around?
lambday: please have a look into sparse features how the feature matrix is registered as a parameter
this work
the way in SparseMatrixOperator does not
I will fix it
When I'm running "make -C shogun check-examples-r_static" in src, it recompiles everything, even if I just did a fresh build.
Anyone knows why?
Is there a special reason?
thoralf, it recompiles the examples always yes
shogun: Heiko Strathmann :develop * 1db31c8 / / (2 files): https://github.com/shogun-toolbox/shogun/commit/1db31c8a6bbb38591459d3ab889f6bb630fbf593
shogun: fixed some crashes in sparse features unit tests
shogun: Heiko Strathmann :develop * 97eb3ac / src/shogun/base/Parameter.cpp: https://github.com/shogun-toolbox/shogun/commit/97eb3acce2fdf852c19a1c89d389d68c3a6ce1b6
shogun: fixed a memory issue related to uninitialised memory which should be NULL
shogun: Heiko Strathmann :develop * e5f8b63 / src/shogun/mathematics/logdet/SparseMatrixOperator.cpp: https://github.com/shogun-toolbox/shogun/commit/e5f8b63e7e479fca83d760c6c9552cafe251675a
shogun: changed sparse matrix parameter registering
shogun: Heiko Strathmann :develop * d3697bf / / (4 files): https://github.com/shogun-toolbox/shogun/commit/d3697bf41307c13f6e1e44f391ee2e84eac232b3
shogun: Merge pull request #1274 from karlnapf/develop
shogun: Fix segfaults in unit tests
sonney2k: No, not only the examples.  Even the library.
no way
thoralf, only if you git pull inbetween
sonney2k: Even if I'm calling "make -C shogun check-examples-r_static" two times consecutively, it re-builds the library on the second call.
sonney2k: No git operation in between.
build #1499 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1499  blamelist: Heiko Strathmann
build #1500 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1500  blamelist: Heiko Strathmann
build #1498 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1498  blamelist: Soeren Sonnenburg
sonney2k: Okay, autosave touched the timestamp of the Makefile.
thoralf, hah!
HeikoS, I don't understand why tests fail just now? The sparse change is from June 11 ...
HeikoS: just checked.. I am not sure what atomic does.. some sync/mutex stuffs?
van51, I see the bug
van51, let me comment it
lambday, yes it is guaranteed that when >1 thread do +=1 the result will be correct
lambday, otherwise we get crashes/leaks
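A rough illustration of the guarantee described above, in Python rather than C++: Python has no direct counterpart to a C++11 atomic integer, so a lock stands in for it here. The `RefCounted` class is hypothetical and only mimics the idea of a reference counter that several threads increment at once.

```python
import threading


class RefCounted:
    """Toy reference counter; the lock plays the role of the atomic."""

    def __init__(self):
        self._count = 0
        self._lock = threading.Lock()

    def ref(self):
        # The read-modify-write happens as one indivisible step.
        with self._lock:
            self._count += 1
            return self._count

    def unref(self):
        with self._lock:
            self._count -= 1
            return self._count

    def ref_count(self):
        with self._lock:
            return self._count


def hammer(obj, n):
    for _ in range(n):
        obj.ref()


obj = RefCounted()
threads = [threading.Thread(target=hammer, args=(obj, 10000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = obj.ref_count()
```

If increments were lost, the count would no longer match the number of live references, and an object could be freed too early or never freed, which is the crash/leak scenario mentioned above.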
iglesiasg, cool you have a customer now :)
sonney2k, yeah! haha
votjakovr: hey!
votjakovr: good to see you  finally :)
votjakovr: yes exactly this is what I want to do, but I would like to keep it in a general form so that we can easily extend it for the EP
votjakovr: also for the covariance, we should use the matrix inversion lemma form where we do only have to invert B, which is already available in the implementation
HeikoS: Ok, i'll do it.
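For reference, the matrix inversion lemma form mentioned above is usually written as follows. This assumes the standard Laplace-approximation notation (K the prior covariance, W the diagonal negative Hessian of the log-likelihood), which the chat itself does not spell out:

```latex
\Sigma = \left(K^{-1} + W\right)^{-1}
       = K - K W^{1/2} B^{-1} W^{1/2} K,
\qquad
B = I + W^{1/2} K W^{1/2}
```

Only B has to be inverted (or factorized), which matches the remark that B is already available in the implementation.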
sonney2k: so you don't keep a counter anymore
sonney2k: the index is hashed for a new index and the value is added there?
votjakovr: so something like get_posterior_approximation_mean/cov which returns a vector and covariance matrix
and this as an optional method of CInferenceMethod which is then overloaded in LaplacianInference and EPInference
votjakovr: if you could do that next, this would be extremely useful
since I currently need it for my research :D
and my python code is so slow
and next thing should be I think getting the logit classifier to work
HeikoS: ok, i'll do it :)
votjakovr: nice :) let me know how it goes
van51, so which algorithm did you mean?
I thought you meant a)
sonney2k: a) is for the hashed doc features
sonney2k: c) would be quadratic for dense, right?
yeah that is what I was referring to
you need to compute the hash for some n-gram to get an index
good night people
gsomix, good night!
van51, you don't have an index like you have with dense
van51, agreed?
sonney2k: yes
van51, this doesn't make a difference for linear features
sonney2k: I think for tokens, vw concatenates them and hashes that
van51, because when you do a) and get the same h_idx multiple times you just add 1 to the value each time
OK?
yeap
if you go quadratic features though
you would need to know how often h_idx is the same
because you would do count[h_idx_1] * count[h_idx_2]
that is pretty annoying
van51, agree?
sonney2k: yes
sonney2k: I see that issue now
van51, you don't need that with dense
that is what I wanted to say
sonney2k: ok then :)
with dense you have the value and the index
sonney2k: well the answer for that is in vw, so I will read the code that Benoit suggested and see what I can do
sonney2k: yes idd
sonney2k: ok, so coming up : a PR to fix what I was doing wrong
sonney2k: and a PR for quadratic on dense
actually an update on that
van51, I only see 2 solutions currently - keep all the h_idx around, sort them and then go over them or use some hashmap on them and then iterate over the values
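The hashmap variant of the two options above could look roughly like this. It is a sketch under assumptions: `zlib.crc32` stands in for whatever hash the library uses, and the pair index `h1 * dim + h2` is a hypothetical choice for illustration.

```python
import zlib
from collections import Counter


def hashed_index(token, dim):
    # Stand-in hash; the real implementation may use a different function.
    return zlib.crc32(token.encode("utf-8")) % dim


def hashed_counts(tokens, dim):
    # Linear case: every occurrence of a token adds 1 at its hashed index.
    counts = Counter()
    for token in tokens:
        counts[hashed_index(token, dim)] += 1
    return counts


def quadratic_from_counts(counts, dim):
    # Quadratic case: iterate over the stored counts and multiply them,
    # i.e. count[h_idx_1] * count[h_idx_2] for every pair of hashed indices.
    quad = Counter()
    for h1, c1 in counts.items():
        for h2, c2 in counts.items():
            quad[h1 * dim + h2] += c1 * c2
    return quad


tokens = ["ab", "bc", "ab", "cd"]
counts = hashed_counts(tokens, dim=16)
quad = quadratic_from_counts(counts, dim=16)
```

The sum of all quadratic values equals the squared number of tokens no matter how the hash collides, which makes a cheap sanity check.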
van51, yeah does quadratic work now?
sonney2k: I had rolled back my code to fix that
sonney2k: so I will do it again now
fix what?
sonney2k: I'm confident that it'll be working though
sonney2k: what I was doing wrong in hashed dense
sonney2k: if you want to browse a bit through the commit now to understand what I mean, it's here : https://github.com/van51/shogun/commit/a46a05f62d0bc6779b20c87e6c41e22819ad92a5
van51, are we talking linear or quadratic now?
sonney2k: linear
b'coz I was talking quadratic :D
van51, but nevertheless comparing it to a linear kernel is an excellent test
I added param reg for SGSparseMatrix that's why I did it that way in CSparseMatrixOperator :(
lambday, look at what I did... we have some C++11 for some stuff now (optional...) so the old syntax was causing issues
sonney2k: okay I am checking
sonney2k: the quadratic features I m working on now, are for hashed dense features, so the resulting dot product is from the hashed representations and it's different
sonney2k: its awesome
lambday, ??
sonney2k: so we can use c++11 specific things? I heard that they have all those map lambda things too
van51, what do you get? what are the examples?
lambday, well you have to ifdef HAVE_CXX11 though and we might need to add a particular test to configure for the feature you are trying to use?
https://gist.github.com/van51/6061917
sonney2k: ^
sonney2k: as of now I am not using anything specific... but good to have that test if we can use some.. I'll try something
lambday, I use that only to have no overhead in SGReferencedData (atomic for refcount is just 4 bytes)
van51, that all doesn't make sense look at http://www.shogun-toolbox.org/doc/en/current/classshogun_1_1CPolyKernel.html
it should be (5+1+1)**2 == 49 for the first element
sry
(5*5 + 1*1 + 1*1)**2 = 729
so kernel matrix is correct
sonney2k: just checked.. I am not sure what atomic does.. some sync/mutex stuffs?
van51, I see the bug
van51, let me comment it
lambday, yes it is guaranteed that when >1 thread do +=1 the result will be correct
lambday, otherwise we get crashes/leaks
iglesiasg, cool you have a customer now :)
sonney2k, yeah! haha
shogun: Soeren Sonnenburg :develop * 6681582 / src/shogun/base/SGObject.cpp: https://github.com/shogun-toolbox/shogun/commit/6681582dd33849c6c2d87cb6d4af3178efe57d20
shogun: add CSGObject as supported serialization PT
sonney2k: I get why you say that there, but that's just the seed
sonney2k, let see if we get someone apart from me to do something with the structured framework
sonney2k:  I mean the value is still hashed and discretized so the kernel matrices will still be different
van51, errm the value should not be hashed but the index!
sry misread that
van51, we want to hash the index - the value stays v[i]*v[j]
sonney2k: so you don't keep a counter anymore
sonney2k: the index is hashed for a new index and the value is added there?
[travis-ci] it's Heiko Strathmann's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9389241
sonney2k, I actually had come across some work by Sebastien (who asked the question) a few months ago
the errata list of a book
van51, yes exactly
van51, clear or not?
sonney2k: i get what you mean, it's just different from what I had understood in the beginning
van51, ok - when we explode feature spaces and compress them with hashing we do this only on the indices. so quadratic is just that
anyway I guess now you know what to do
build #1503 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1503  blamelist: Soeren Sonnenburg
sonney2k: ok so my question is this: this addition of the value to the hashed index is also for numerical features w/o quadratic or just when you want the index of quadratic features?
sonney2k: I mean should the default behavior of hasheddensefeatures for ints w/o quadratic be to add the value to the hashed index, or  keep a counter like it does now?
privet :P
privet :)
sonney2k, around?
gsomix, what's up?
sonney2k, another part
just a moment
[travis-ci] it's Soeren Sonnenburg's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9389778
van51, I couldn't parse the sentence sorry - for quadratic features it is *always* like v[i]*v[j] + hash(i*size +j)
van51, does that explain it?
sonney2k, https://github.com/shogun-toolbox/shogun/pull/1275 ~500 lines :(
sonney2k: yes I think I'm clear on quadratic features
sonney2k: apart from them, on the simple case of using hashing on dense features of ints
sonney2k: when you get the hashed index, you add the value of the current dimension or do you +1?
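A minimal sketch of the quadratic rule stated above for dense vectors, reading "v[i]*v[j] + hash(i*size + j)" as "add the value v[i]*v[j] at the index obtained by hashing i*size + j". The hash (`zlib.crc32`) and the function names are assumptions for illustration, not the library's implementation.

```python
import zlib


def hash_index(raw_index, dim):
    # Stand-in hash on the *index*; the value itself is never hashed.
    return zlib.crc32(str(raw_index).encode("ascii")) % dim


def hashed_quadratic(v, dim):
    out = [0.0] * dim
    size = len(v)
    for i in range(size):
        for j in range(size):
            # Value stays v[i] * v[j]; only the position is hashed.
            out[hash_index(i * size + j, dim)] += v[i] * v[j]
    return out


v = [5.0, 1.0, 1.0]
hashed = hashed_quadratic(v, dim=8)
```

Collisions move mass between buckets but never create or destroy it, so the entries always sum to (sum of v)^2. For document-style features where every v[i] is 1 (or a constant c after normalization), each product is simply 1 (or c**2).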
sonney2k: from what you're saying about the quadratic features it follows to do the first
sonney2k: but I remember asking Olivier and telling me the second
van51, you are thinking in terms of hashedDocFeatures now
van51, there v[i] = 1 for all i
van51, except when you normalize then it is some other constant
sonney2k: so the current behavior of hasheddense is incorrect
van51, so v[i]*v[j] should be c**2
van51, did you fix it now?
s/now/already?
sonney2k: not yet
then not :)
sonney2k: gotcha. that's easy to do without image uploading, but if the demo converts the image to matrices and downsample in the client side and just do json exchange with server, not real file upload. Can I do it in that way?
sonney2k: I will close this PR to fix this first
van51, I then don't understand your question
foulwall, you mean you still upload images to the server? what I don't like about it is that you have to think *HARD* about security
sonney2k: heh
sonney2k: I'll comment on my PR
sonney2k: maybe on the code it will be easier to express what I mean
not the file but a json with the matrix, the convert can be done in client.
foulwall, you mean a byte matrix that you check for sizes etc on the server?
sonney2k: ooooohhhhh I thought they are with C here
foulwall, I have seen people exploiting bugs like too long string in the browser url gives you a shell on the client
just give user a choice what dataset to embed
I think that's all
and some parameters
lisitsyn, sonney2k: Yes, that should be enough.
Have spent enough time exploiting web applications for this ;)
argh, I see.
thanks thoralf lisitsyn sonney2k , I'll make a simple one now
:)
gsomix, what do you need CReader for?
sonney2k, interface for readers. I have one more - StringReader where source for reading is SGVector.
gsomix, I don't see the relation to CFile?
I am just worried that CFile is already doing sth like that
sonney2k, there is no relation to CFile. CFile for access to file format classes, CReader for reading ascii data from sources.
*err CFile for access to file formats
sonney2k, CFile reads and writes whole vectors, matrices, lists.
gsomix, when you have an SGMatrix now and want to load it from a .csv
gsomix, how is it done then?
sonney2k: state on what?
buildbot/travis
sonney2k: unit tests green on my machine
fixed 100000000000 bugs
and now we have automated detection of many problems
also, we have a crude test of clone and equals
(works)
sonney2k: one thing I still would like to do is to do these tests on some data
currently, empty class instances do not contain any data
you know, fill the matrices etc
HeikoS, you're strong, man. :)
HeikoS, for what exactly? I mean you need a complete example for that to work?!
HeikoS, you will lead today right?
sonney2k: yes
lets discuss afterwards,
HeikoS, indeed unit tests work now
HeikoS: you are the hero!
HeikoS, congrats you are the hero of today!
* sonney2k sends over an ale!
yeah! :)
sonney2k: Hero? I mean that some of objects from CFile's class can use reader and writers.
so, CFile it's just interface + FILE*
I hope it's not too complex architecture :/
java style
well I am still confused :/
sonney2k, :C
sonney2k, what's not clear?
gsomix, I don't understand the benefit of these readers
I would have expected these scalar read to be in some class derived from CFile
and maybe on top of this some helpers
sonney2k, it is convenient. look, I have FileReader (for reading lines from csv) and StringReader (for reading/tokenizing primitive types from these lines) in CCSVFile
both FileReader and StringReader have CTokenize inside for tokenizing
result -> not much code, clear
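The split described above (one piece that only reads lines, another that only tokenizes and converts a line) can be sketched like this. `LineReader`, `LineParser` and `load_matrix` are hypothetical names for illustration and do not reproduce the actual CFileReader/CStringReader code.

```python
import io


class LineReader:
    """Reads a text stream line by line and does nothing else."""

    def __init__(self, stream):
        self._stream = stream

    def read_lines(self):
        for line in self._stream:
            line = line.rstrip("\n")
            if line:  # skip empty lines
                yield line


class LineParser:
    """Tokenizes one line and converts the tokens to numbers."""

    def __init__(self, delimiter=","):
        self._delimiter = delimiter

    def parse_floats(self, line):
        return [float(token) for token in line.split(self._delimiter)]


def load_matrix(stream, delimiter=","):
    # A CSV loader is then a thin composition of the two pieces.
    reader = LineReader(stream)
    parser = LineParser(delimiter)
    return [parser.parse_floats(line) for line in reader.read_lines()]


csv_text = "1.0,2.0,3.0\n4.0,5.0,6.0\n"
matrix = load_matrix(io.StringIO(csv_text))
```

Keeping the two responsibilities apart means the parser can be reused on lines that come from memory instead of a file.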
thoralf, gsomix Mostly uninitialised memory, wrong get_name, wrong generic type
shogun: Heiko Strathmann :develop * 496e0aa / src/shogun/ (31 files): https://github.com/shogun-toolbox/shogun/commit/496e0aacf1bc6cc7cd0e7c76bf5ac9eaa98f5178
shogun: Merge pull request #1276 from karlnapf/develop
shogun: fixed many more unit test segfaults and non-passes.
sonney2k, http://pastebin.com/pSbeTx75
line_reader is CFileReader object
reader is CStringReader object
so, reading line by line it's work for line_reader
reading of individual elements is work for reader
this code is part of get_vector method
in CCSVFile
gsomix, but CFileReader also reads int / bool etc?
so it is doing both? reading lines and parsing? or what is this for?
I mean I understand that you need a line reader and some parser of a line that you have read
sonney2k, yep, CFileReader can read data types
maybe it will be useful some day
sonney2k: I added set_generic_sgobject()
to CSGObject
sonney2k: but we totally should forbid T=SG*
thats evil
sonney2k, reading lines it's just reading substring with '\n' delimiter
*err, is
HeikoS, does     template<> void CSGObject::set_generic() work?
sonney2k: no
sonney2k, parsing is just reading line and atoi/strtol
only classname itself
HeikoS, you mean it needs the real class in there?
yes
HeikoS, interesting. Didn't know that
sonney2k: ah one issu
using set_generic_sgobject is not good I just realise
since its only possible to use CSGObject classes as template argument then
gsomix, maybe it is better to have a LineReader class that really just reads line by line and then a ParseLine class with your Reader Interface
sonney2k: so maybe add all cases we need with the above way then?
HeikoS, wait I mean we only want to support uint8_t ... complex* and CSGObject* right?
yes
ah shit
so we have to do the one by one thing
HeikoS, ???
sonney2k: we have to do template<> void CSGObject::set_generic()
build #1504 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1504  blamelist: Heiko Strathmann
for each class
HeikoS, why that?
sonney2k: what other way?
is there?
HeikoS, what is wrong with your set_generic_sgobject?
or *_object
sonney2k: imagine CSet
you want to use that on PT_BOOL
and also on PT_SGOBJECT
all it does is just set         m_generic = PT_SGOBJECT;
if I just call set_generic() I get a problem with SGOBJECT
HeikoS, not in the same CSet though
if I call set_generic_sgobject() I get a problem with PT_BOOL
sonney2k, why? then LineReader is just FileReader w/o read_int, read_readl, etc methods.
I see
sonney2k: I have to call it in the constructor right?
I just can rename it to StreamReader, for example
you don't know the type is what you mean
where I dont know whether T is a class or a primitive type
not so confusing
sonney2k: yes
HeikoS, argh!
* gsomix afk
but we did this in the CDynamicObjectArray
sonney2k: there we said that T has to be a subclass ob CSGOBject
that is how it should be done in my eyes
gsomix, but then just drop the CFileReader and keep the parser I mean why do you need the CFileReader then?
or add one by one
sonney2k: what do you think?
HeikoS, true true
that is why we did it that way for DynamicObjectArray
so we would need CSet and CObjectSet *shrug*
sonney2k: yep
etc
build #1505 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1505  blamelist: Heiko Strathmann
there are three classes
that currently fail the tests here
which are cause by this problem
HeikoS, actually if CSet etc is calling SG_REF it should be named like it right?
same for map etc
sonney2k: does it do that?
lets see
no
so it would leak memory with CSGObject types anyways
sonney2k: ouch
no it does not
and it hashes the objects
so it hashes pointers to SGObjects
HeikoS, yeah but just consider someone putting in objects
these might be gone
if not manually REF'd
sonney2k: so no SGObject for CSet
exactly
same for CMap
ok then
argh
so much stuff
sonney2k: dynamicobject array is not even generic
sonney2k: so can you de-activate a few examples for me?
sonney2k: Set, ParseBuffer, TreeMachine
shogun: Heiko Strathmann :develop * d53e415 / src/shogun/io/MemoryMappedFile.h,src/shogun/io/SimpleFile.h: https://github.com/shogun-toolbox/shogun/commit/d53e4151e553a7d786382d31ab08132f496facca
shogun: fixed generics to pass unit-tests
shogun: Heiko Strathmann :develop * c48a483 / src/shogun/io/MemoryMappedFile.h,src/shogun/io/SimpleFile.h: https://github.com/shogun-toolbox/shogun/commit/c48a4836a16a5c5ba991ca81bbd87e99dc1f6544
shogun: Merge pull request #1277 from karlnapf/develop
shogun: fixed generics to pass unit-tests
sonney2k: when you have time can you check this : https://github.com/van51/shogun/commit/a46a05f62d0bc6779b20c87e6c41e22819ad92a5 ?
sonney2k: all other unit tests work now
just those three classes are failing due to the above issues
HeikoS, you mean in the autgenerated tests right?
sry gtg
hike time
sonney2k, for the future. I mean one time we will need reading data types directly from file stream
build #1506 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1506  blamelist: Heiko Strathmann
HeikoS: sending a PR for CG_M
HeikoS: Is there a way of checking how many references are still alive in shogun?
HeikoS: I'd like to add an assertion in one of my tests... ;)
build #1507 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1507  blamelist: Heiko Strathmann
lisitsyn:  around?
lambday: checking :)
thoralf: that is hard to to, but you can enable trace memory allocation to do this. It is not always stable though
thoralf: please dont add assertions to your tests
thoralf: we rather want to valgrind them at some point
sonney2k, wiking, lisitsyn, HeikoS: I have an old version of SWIG and SWIG 2. something in a machine, but the configure script detects the old one and fails
HeikoS: No worries.  Its just for personal use.
any way of telling it I want to use swig2.0?
from the command line I can do without problems swig and swig2.0
versions 1.3.29 and 2.0.4, respectively
iglesiasg: no idea
thoralf: so configure with --enable-trace-mallocs
then shogun_exit() will tell you how many objects there are and where
thoralf: its probably possible to call this function alone
HeikoS: That's great.  Thanks.
HeikoS: travis build didn't start yet :-/
HeikoS: let me know if you agree to the additional interface I added to fit cg_m into CLinearSolver
HeikoS: the problem was, current template structure allowed to return only real vector for solve method if we give it real operator and real vector..
HeikoS: but we have additional complex shifts/weights which makes the solution vector complex...
HeikoS: now converting the operator to complex each time would have been bad I think
HeikoS, it was just --swig=swig2.0 finally
iglesiasg: I see :)
HeikoS: so this new interface is flexible, as in, it returns complex solution, if we give it real operator, real vector but complex shifts
lambday: btw I just realised these double generics will cause problems
HeikoS: problem with?
HeikoS, I was creating an alias and putting it in .bashrc and at the end I just grep the configure file to see how the swig check is done and saw the option by chance :D
serialisation, clone etc will not be possible
lambday: this set_generic stuff
HeikoS: but for class member vars, it only uses one generic for whatever I have added
lambday: ah I see
HeikoS: other type is for methods.. that doesn't store anything
so set_generic can work, right?
but the instances themselves?
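To make the "real operator, real vector, complex shift, complex solution" situation concrete, here is a small sketch that solves (A + shift*I) x = b with a conjugate-orthogonal CG iteration (unconjugated dot products). It is plain COCG on a single shifted system, not the multi-shift CG_M recurrence from the PR, and every name in it is illustrative.

```python
def shifted_matvec(A, shift, v):
    # Applies (A + shift * I) to v without ever storing a complex matrix.
    n = len(v)
    return [sum(A[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]


def cocg_shifted(A, shift, b, tol=1e-12, max_iter=50):
    n = len(b)
    x = [0j] * n
    r = [complex(bi) for bi in b]  # residual of the zero initial guess
    p = list(r)
    rr = sum(ri * ri for ri in r)  # unconjugated dot product
    for _ in range(max_iter):
        if sum(abs(ri) ** 2 for ri in r) ** 0.5 < tol:
            break
        Ap = shifted_matvec(A, shift, p)
        alpha = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = sum(ri * ri for ri in r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]  # real symmetric operator
b = [1.0, 2.0, 3.0]                                      # real right-hand side
shift = 0.5 + 1.0j                                       # complex shift

x = cocg_shifted(A, shift, b)
residual = [bi - yi for bi, yi in zip(b, shifted_matvec(A, shift, x))]
residual_norm = sum(abs(ri) ** 2 for ri in residual) ** 0.5
```

The operator stays real throughout; only the shift and the iterates are complex, which is the reason given above for not converting the operator to complex on every solve.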
17:27 also how about python, will they be accessible form outside 17:27 since for python, one has to manually fix the template parameters 17:27 -!- Cheng [~yaaic@ip-109-45-0-78.web.vodafone.de] has joined #shogun 17:28 HeikoS: which classes will we expose to python modular interface?? 17:28 HeikoS: most of the things are internal 17:28 lambday: we can choose 17:28 lambday: ok then 17:28 then everything is fine :) 17:28 HeikoS: I am not sure :( we'll have to see :( 17:28 Just checking that I can log in. Will be back later. 17:28 -!- Cheng [~yaaic@ip-109-45-0-78.web.vodafone.de] has quit [Client Quit] 17:28 Cheng: :) 17:28 HeikoS: but since class members are of same type, I don't think we'd face problem if we set generic with that type, no? 17:29 lambday: what do you mean? 17:29 shogun: root :develop * 5fc0bd0 / / (18 files): https://github.com/shogun-toolbox/shogun/commit/5fc0bd08aca6b9e4a253dc704b8e0f8c5706e1de 17:29 shogun: cg_m added, template structure changed for linear solver 17:29 shogun: Heiko Strathmann :develop * 0767425 / / (18 files): https://github.com/shogun-toolbox/shogun/commit/0767425659ae6ef698efc2322c60ad23ef9e1a32 17:29 shogun: Merge pull request #1278 from lambday/feature/log_determinant 17:29 shogun: 17:29 shogun: cg_m added, template structure changed for linear solver 17:29 and mostly generics are abstract... final subclasses are mostly non-generic 17:30 HeikoS: say, I used template class A... but its members all use T, one of its methods takes an argument may be which s ST 17:30 and also, ST default is T 17:31 HeikoS: will that solve the problem with python? 17:31 Cheng, got your email and replying. 17:31 by "members all use T" I meant attributes... 17:32 lambday: sorry re 17:38 lambday: I see 17:38 so why have this double template if you dont use it 17:38 HeikoS: say, CIterativeLinearSolver is double template.. 
which has abstract solve(CLinearOperator, SGVector), return type is SGVector 17:41 lambday: ok 17:41 HeikoS: now, one of its subclass, CConjugateGradientSolver uses T=ST=float64_t 17:41 so this class can never be serialised nor visible to python 17:41 lambday: but it is abstract right? 17:41 HeikoS: and another subclass CConjugateOrthogonalCG uses T=complex64_t, ST=float64_t 17:42 so the subclass is not even generic 17:42 yes 17:42 lambday: great 17:42 lambday: great job 17:42 great latest patch also btw 17:42 HeikoS: is it the right way to do it? 17:42 thanks :) 17:42 lambday: could you pls run the unit tests locally and tell me what happens? 17:42 HeikoS: the only place the final child classes are double generic is CLinearOperator 17:43 build #1508 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1508  blamelist: root 17:43 HeikoS: all of them? 17:43 lambday: yes 17:43 lambday: the linear operator is a class member though right? 17:43 HeikoS: alright... checking 17:43 HeikoS: yes.. but there also the matrix is of T type, so set generic uses T, but apply is like SGVector CLinearOperator::apply(SGVector vector); 17:44 -!- van51 [~van51@athedsl-399972.home.otenet.gr] has quit [Quit: Leaving.] 
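The T/ST arrangement lambday describes can be sketched as below. This is a minimal illustration only: `DenseOperator` is a hypothetical stand-in built on `std::vector`, not Shogun's actual `CDenseMatrixOperator`, and the row-major storage is an assumption. It shows the point made in the chat: the stored matrix (and the result) use one type `T`, while `apply` accepts a vector of a second type `ST` that defaults to `T`.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dense linear operator (not Shogun's class).
// T  : element type of the stored matrix and of the result vector
// ST : element type of the vector the operator is applied to (defaults to T)
template <class T, class ST = T>
class DenseOperator {
public:
    // entries are assumed to be stored row-major, dim x dim
    DenseOperator(std::vector<T> entries, std::size_t dim)
        : m_entries(std::move(entries)), m_dim(dim) {
        assert(m_entries.size() == m_dim * m_dim);
    }

    // Takes a vector of ST and returns a vector of T, so a complex
    // operator can be applied to a real vector without converting it.
    std::vector<T> apply(const std::vector<ST>& x) const {
        assert(x.size() == m_dim);
        std::vector<T> y(m_dim, T(0));
        for (std::size_t i = 0; i < m_dim; ++i)
            for (std::size_t j = 0; j < m_dim; ++j)
                y[i] += m_entries[i * m_dim + j] * x[j];
        return y;
    }

private:
    std::vector<T> m_entries;  // only T is stored as a member
    std::size_t m_dim;
};
```

Because only `T` appears in the data members, anything that inspects the stored state (the role `set_generic` plays in the discussion) needs to know about `T` alone; `ST` affects method signatures only.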
17:45 HeikoS: I should do a git pull again before running the tests 17:45 yes do 17:46 lambday: I didnt get that 17:46 build #1509 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1509  blamelist: Heiko Strathmann 17:48 HeikoS: https://github.com/shogun-toolbox/shogun/blob/develop/src/shogun/mathematics/logdet/DenseMatrixOperator.cpp 17:49 this is what I used 17:49 its a double template class, but set generic sets T as the type 17:50 just its apply method, which is no more abstract, takes a vector of type ST as an argument, and returns another vector of type T as an argument 17:50 oops 17:50 what does template class CDenseMatrixOperator; create? 17:50 ignore the last "as an argument" 17:50 HeikoS: T=ST=complex64_t 17:51 okay, those should be fine if set_generic() was called 17:51 since default for ST is T 17:51 but what about template class CDenseMatrixOperator; 17:51 that calls set_generic (since the matrix inside it is of complex type) 17:52 lambday: so it should not cause any problems 17:52 wow, thats actually quite nice 17:52 but apply can take a real vector and that complex operator can be applied and returns a complex vector 17:52 yep I get it 17:52 cool 17:52 very nice 17:52 thanks :) 17:52 HeikoS: CG_M unit-tests show similar performance with normal CG for those small unit-tests 17:53 lambday: I mean at the end, people will not necessarily use all those internals but just the logdet approx class which has some default settings 17:53 lambday: ok good! 17:53 HeikoS: yes that's what I thought 17:53 lambday: how about the unit tests? 17:53 HeikoS: well, since these CG solvers of ours can work with dense matrix too.. 
so I thought of checking its performance before I move on to Lanczos 17:54 HeikoS: oh checking :D 17:54 -!- travis-ci [~travis-ci@ec2-54-224-203-225.compute-1.amazonaws.com] has joined #shogun 17:54 [travis-ci] it's Heiko Strathmann's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9395154 17:54 -!- travis-ci [~travis-ci@ec2-54-224-203-225.compute-1.amazonaws.com] has left #shogun [] 17:54 lambday: ok 17:54 HeikoS: do you think it would be okay to use eigen3 objects as parameters of protected methods for a "C" class? 17:55 protected is fine yet 17:55 yes 17:55 HeikoS: I was afraid and used SGVectors everywhere 17:56 lambday: not optimal, but if that avoids re-computing sparse matrices, then yes do it 17:56 HeikoS: no its just the vectors, I don't think SGVector instead of eigen3::Vector will slow things down that much 17:56 HeikoS: the places I used didn't even need dot products or anything vector-specific - just accessing elements one by one.. so SG is fine I guess 17:57 HeikoS: non-technical question - you play base guitar? :D 17:58 HeikoS: btw unit-tests give segfaults for clone_equals_LineReader and stops :( 17:58 lambday:  no it wont slow things down 18:00 HeikoS: SGObject.clone_equals_SNPStringKernel too fails 18:00 lambday: not much overhead 18:00 I hope at least 18:00 lambday: no, but normal guitar 18:00 phew! 18:00 just a bit of base guitar 18:00 lambday: it depends how often this stuff is called, but we do not have to allocate new memory (which is slow) 18:00 HeikoS: just got back 18:01 lisitsyn: could you disable automagic unit tests for  Set, ParseBuffer, TreeMachine 18:02 HeikoS: yes.. for CG_M things, I used same memory whenever I could.. krylstat used unnecessary vectors at a few places, I didn't use them 18:02 lisitsyn:  these are highly non-trivial to fix 18:02 lambday: good! 
then we will be even faster :) 18:02 (and more stable, tested etc) 18:02 and interfaces 18:02 =world dominance 18:02 lol :D 18:03 but I am scared to see CG_M performing on that huge matrix :-s 18:03 shogun: Heiko Strathmann :develop * 8180b78 / src/shogun/features/PolyFeatures.cpp: https://github.com/shogun-toolbox/shogun/commit/8180b78d2d64521c75a15b27be9d8f5561b6085e 18:03 shogun: fixed an uninitialised memory issue 18:03 shogun: Heiko Strathmann :develop * a5e9b55 / src/shogun/io/LineReader.cpp: https://github.com/shogun-toolbox/shogun/commit/a5e9b55fd08b5b04540841fa93ee7b9a17a43d9b 18:03 shogun: fixed an uninitialised memory error 18:03 shogun: Heiko Strathmann :develop * 3d97b4d / src/shogun/features/PolyFeatures.cpp,src/shogun/io/LineReader.cpp: https://github.com/shogun-toolbox/shogun/commit/3d97b4d82220493dfe84ac504f16ff4262beb564 18:03 shogun: Merge pull request #1279 from karlnapf/develop 18:03 shogun: 18:03 shogun: more bugfixes 18:03 lambday: could you pull and run again? unit tests I mean 18:04 checking... 18:04 lambday: second half of GSoC will be tuning :) 18:04 compiling shogun melts my computer 18:05 HeikoS: let me try 18:05 Physical id 0:  +97.0 C  (high = +86.0 C, crit = +100.0 C) 18:05 Core 0:         +96.0 C  (high = +86.0 C, crit = +100.0 C) 18:05 Core 1:         +97.0 C  (high = +86.0 C, crit = +100.0 C) 18:05 whoa! :-o 18:05 lisitsyn: thanks! 18:05 HeikoS: that's why I use insti's stuffs :D 18:05 lisitsyn: that should hopefully then make unit tests green again 18:05 HeikoS: I think we just have to patch line 22 here 18:05 https://github.com/shogun-toolbox/shogun/blob/develop/tests/unit/base/clone_unittest.cc.py 18:05 see you later during the meeting guys 18:05 iglesiasg: see you man :) 18:06 lisitsyn: yes, cool, could you do that? 18:06 -!- iglesiasg [~iglesias@2001:6b0:1:1041:d38:c6f9:14dc:bce8] has quit [Quit: Ex-Chat] 18:06 iglesiasg: see you! 18:06 HeikoS: we gotta think a bit about that tridiagonal solver and then coloring.. 
:-/ 18:06 HeikoS: krylstat uses colpack 18:06 brb 18:06 lambday: yes indeed 18:07 build #1510 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1510  blamelist: Heiko Strathmann 18:08 -!- nube1 [~rho@36.252.202.109] has joined #shogun 18:08 -!- nube [~rho@36.252.236.22] has quit [Ping timeout: 248 seconds] 18:08 HeikoS: doesn't give segfaults now 18:09 many tests fail but that's okay I guess? 18:10 lambday: nice finally 18:10 lambday: should be only three 18:10 three groups 18:10 HeikoS: yes three groups 18:10 HeikoS: well, 4 18:10 shogun: Sergey Lisitsyn :develop * b60493b / tests/unit/base/clone_unittest.cc.py: https://github.com/shogun-toolbox/shogun/commit/b60493bf56223c61fb15d0cf89300ddd79f5f71d 18:10 shogun: Added clone test ignores 18:10 Set, SNPStringKernel, ParseBuffer and TreeMachine 18:11 HeikoS: this should work 18:11 ouh 18:11 SNPStringKernel too? 18:11 checking again 18:11 shogun: Heiko Strathmann :develop * b92a873 / src/shogun/ui/ (6 files): https://github.com/shogun-toolbox/shogun/commit/b92a873adefaf9743fba785fba25465201122608 18:11 shogun: Revert "fixed uninitialised memory bugs" 18:11 shogun: 18:11 shogun: This reverts commit ab0d3977de71d7f031dfc8026580c7e7a39707df. 18:11 shogun: Heiko Strathmann :develop * 56ed507 / src/shogun/ (11 files): https://github.com/shogun-toolbox/shogun/commit/56ed507d202e7e7b1fcfee2c9d2c29586fd81e4f 18:11 shogun: Revert "more uninitialised memory fixed" 18:11 shogun: 18:11 shogun: This reverts commit 783aa89bf8711c67a125218834c135e834178de0. 
18:11 shogun: Heiko Strathmann :develop * 9974ffe / src/shogun/ (17 files): https://github.com/shogun-toolbox/shogun/commit/9974ffe9e6433059f583f679b4e2d3eaa5551b38 18:11 shogun: Merge pull request #1280 from karlnapf/develop 18:11 shogun: 18:11 shogun: undo changes to static interfaces 18:11 this should also fix the static examples 18:12 my logdet dir is getting huge and huge :D 18:13 may be I should separate things into separate folders, later :-/ 18:14 -!- nube [~rho@36.253.81.248] has joined #shogun 18:15 -!- nube1 [~rho@36.252.202.109] has quit [Read error: Connection reset by peer] 18:15 HeikoS: lambday: just put failing tests to that list 18:16 lisitsyn: NameError: global name 'true' is not defined 18:16 ahaa 18:16 ooh sorry :D 18:16 :D 18:16 that was blind fix 18:16 fixing 18:16 True :-/ 18:16 lambday: indeed 18:17 lambday: so what about 18:17 SNP think 18:17 should it be here too? 18:17 SNP too failed but just one of them... I 18:17 build #1512 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1512  blamelist: Heiko Strathmann 18:18 shogun: Sergey Lisitsyn :develop * 2c3da22 / tests/unit/base/clone_unittest.cc.py: https://github.com/shogun-toolbox/shogun/commit/2c3da22acf83b675d665fb9dde3878db1fa4cce7 18:18 shogun: Update clone_unittest.cc.py 18:18 I forgot which one :-/ 18:18 ok just put it there 18:18 lisitsyn, lambday let me know what the unit tests do on your system now 18:18 HeikoS: lisitsyn alright I am checking 18:19 build #1513 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1513  blamelist: Heiko Strathmann , Sergey Lisitsyn 18:20 haha 18:20 ok next build please 18:20 it blames and blames and never stops :( 18:21 -!- pickle27 [~Kevin@67.193.243.174] has joined #shogun 18:22 build #1514 of deb1 - libshogun is complete: Failure [failed test]  
Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1514  blamelist: Sergey Lisitsyn 18:22 HeikoS: :( 18:22 HeikoS: segfault again SGObject.clone_equals_LatentSOSVM 18:22 lambday: really? 18:23 weird I solved that 18:23 build #1511 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1511  blamelist: Heiko Strathmann 18:23 python2.7 command not found 18:23 what 18:23 :D 18:23 lambday:  did you really pull? 18:23 lisitsyn: yes I had that before 18:23 HeikoS: yep 18:24 weiiiird 18:24 shogun takes so long to compile now 18:24 I'm checking again 18:24 HeikoS: it just happened on the buildbot btw 18:25 I mean latent so svm crash 18:25 lisitsyn: ah I see 18:25 so thats fine since I reverted some patches 18:25 next step will solve it 18:26 HeikoS: same result 18:29 HeikoS: lisitsyn: I'll be back in an hour.. going for dinner.... see you guys :) 18:29 lambday: meaning? 18:29 see you 18:29 HeikoS: segfault :( 18:29 enjoy! 18:29 -!- lambday [67157d36@gateway/web/freenode/ip.103.21.125.54] has quit [] 18:29 lisitsyn: did not work 18:29 HeikoS: ignore? 18:30 the three tests are still in there 18:30 damn 18:30 lisitsyn: yes, could you not cold fix that ;) takes ages to re-compile here 18:30 I was sure it is enough 18:30 lisitsyn: 1170 tests pretty good 18:31 -!- votjakovr [~votjakovr@host-46-241-3-209.bbcustomer.zsttk.net] has joined #shogun 18:31 hey votjakovr how are you 18:31 HeikoS: I can see it should not go through - weird 18:31 lisitsyn: hi! i'm not so good, but never mind. And you? 18:35 votjakovr: what's happening? 18:35 I am ok 18:36 lisitsyn: good :) sorry, can't talk about that (too many problems falling on my head, but i'll fight them) 18:39 oh that's ok hope you will be good soon 18:40 lisitsyn: thanks 18:40 Did anyone break develop? 18:43 perl_modular and r_modular took ages to compile, but failed in the end. 
18:44 Haven't tried a clean checkout yet. 18:44 thoralf: what's the error? 18:45 src/interfaces/r_modular/sg_print_functions.cpp:36: undefined reference to Rprintf' 18:46 Many errors, but that's the last one. 18:46 thoralf: I had the same problem 18:46 thoralf: never got it fixed 18:46 was hoping it would resolve itself on its own soon 18:46 pickle27: Oh, I'm not alone.  Good. 18:47 :) 18:47 lisitsyn: I just sent a mail regarding the bug in Jade 18:47 pickle27: yeah looking into it 18:48 -!- nube1 [~rho@36.253.102.30] has joined #shogun 18:48 -!- nube [~rho@36.253.81.248] has quit [Ping timeout: 264 seconds] 18:48 HeikoS: hi! i've received your email, i'll do that! Just a questions: Am i right that you want to sample from posterior approximation q(f | X, y) = N(f^, (K^(-1)+W )^(-1)) and need to evaluate mean (f^) and covariance ((K^(-1)+W )^(-1)) for that? 18:49 pickle27: so sign problem is here still? 18:49 yeah but its weirder than I thought yesterday 18:50 its only when I build it as part of shogun 18:50 my standalone working file works... 18:50 -!- travis-ci [~travis-ci@ec2-23-20-210-220.compute-1.amazonaws.com] has joined #shogun 18:51 [travis-ci] it's Heiko Strathmann's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9397270 18:51 -!- travis-ci [~travis-ci@ec2-23-20-210-220.compute-1.amazonaws.com] has left #shogun [] 18:51 pickle27: haha cool 18:54 pickle27: the same code? 
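For reference, the posterior approximation votjakovr asks about above can be written out as follows. The second line is the standard matrix-inversion-lemma identity for the Laplace approximation; it assumes W is diagonal with non-negative entries (as for log-concave likelihoods) so that W^{1/2} exists, and the definition of B given here is the usual textbook one, assumed rather than taken from the Shogun code.

```latex
\begin{align}
q(f \mid X, y) &= \mathcal{N}\!\left(\hat f,\; (K^{-1} + W)^{-1}\right), \\
(K^{-1} + W)^{-1} &= K - K W^{1/2} B^{-1} W^{1/2} K,
\qquad B = I + W^{1/2} K W^{1/2}.
\end{align}
```

With this form only B has to be inverted (or factorised), and K^{-1} is never formed explicitly.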
18:54 yup 18:55 only difference is that the in shogun the function takes densefeatures and then i get the matrix 18:55 pickle27: can't say much about that - you've got to ensure the data is the same probably 18:56 it essentially is, thats what I spent all of yesterday doing 18:57 you can see by the cov matrix which is virtually the same 18:57 now in that particular case it isn't the exact data just data generated using the same code 18:57 but I have checked with the exact data and its the same problem 18:57 votjakovr: hey! 19:10 votjakovr: good to see you  finally :) 19:10 votjakovr: yes exactly this is what I want to do, but I would like to keep it in a general form so that we can easily extend it for the EP 19:10 votjakovr: also for the covariance, we should use the matrix inversion lemma form where we do only have to invert B, which is already available in the implementation 19:11 -!- lisitsyn [~lisitsin@mxs.kg.ru] has quit [Quit: Leaving.] 19:12 -!- nube [~rho@36.253.205.226] has joined #shogun 19:12 -!- nube1 [~rho@36.253.102.30] has quit [Ping timeout: 246 seconds] 19:12 HeikoS: Ok, i'll do it. 19:13 -!- nube1 [~rho@49.244.72.22] has joined #shogun 19:14 HeikoS, just got your mail. I'll fix it. Thanks! 19:16 gsomix: cool thanks! 
I already fixed some uninitialised variable bugs 19:17 -!- nube [~rho@36.253.205.226] has quit [Ping timeout: 240 seconds] 19:17 votjakovr: so something like get_posterior_approximation_mean/cov which returns a vector and covariance matrix 19:17 and this as an optional method of CINferenceMethod which is then overloaded in LaplacianInference and EPInference 19:17 votjakovr: if you could do that next, this would be extremely useful 19:18 since I currently need it for my research :D 19:18 and my python code is so low 19:18 and next thing should be I think getting the logit classifier to work 19:18 HeikoS: ok, i'll do it :) 19:20 votjakovr: nice :) let me know how it goes 19:20 shogun: Heiko Strathmann :develop * 3b1cddc / src/shogun/latent/LatentSOSVM.cpp: https://github.com/shogun-toolbox/shogun/commit/3b1cddc50f472498bbefb328a70bd60371c9b5c7 19:27 shogun: fix uninitialised memory 19:27 shogun: Heiko Strathmann :develop * 1b8ed95 / tests/unit/base/clone_unittest.cc.py: https://github.com/shogun-toolbox/shogun/commit/1b8ed9502778090c0180aa0803b2bec4e41e2a43 19:27 shogun: classlist does not return class names starting with C 19:27 shogun: Heiko Strathmann :develop * 025a9a4 / src/shogun/latent/LatentSOSVM.cpp,tests/unit/base/clone_unittest.cc.py: https://github.com/shogun-toolbox/shogun/commit/025a9a48545b6726f7624af892d230079afc895d 19:27 shogun: Merge pull request #1281 from karlnapf/develop 19:27 shogun: 19:27 shogun: unit tests green again 19:27 build #1516 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1516  blamelist: Heiko Strathmann 19:33 btw i'd like to ask another question: i evaluate integral simultaneously on few intervals, i have a method: evaluate_quadgk which should return approximate value of integral and error for each interval, both of them are vectors. How best to do it? I mean return two vectors. 
Create class (structure) for it, or something else? This method is private. 19:34 HeikoS: ^ 19:35 -!- travis-ci [~travis-ci@ec2-54-224-203-225.compute-1.amazonaws.com] has joined #shogun 19:40 [travis-ci] it's Heiko Strathmann's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9398719 19:40 -!- travis-ci [~travis-ci@ec2-54-224-203-225.compute-1.amazonaws.com] has left #shogun [] 19:40 whaaaat? 19:46 stupid unit tests 19:46 votjakovr: let me think 19:46 votjakovr: I think the best way would be to pass pre-allocated vectors as references 19:47 build #1515 of deb1 - libshogun is complete: Failure [failed test]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb1%20-%20libshogun/builds/1515  blamelist: Heiko Strathmann 19:48 sonney2k: unit tests are green on my machine 19:48 sonney2k: buildbot says: Generating base/clone_unittest.cc make[1]: python2.7: Command not found 19:48 HeikoS: Ok, like void evaluate_quadgk(SGVector &vals, SGVector &errs, ...) ? 19:49 votjakovr: yes exactly 19:49 votjakovr: with assertions that the vectors are either empty (then they are allocated by the method) or have the correct size 19:50 HeikoS: Ok, i'll do so 19:53 -!- lisitsyn [~lisitsyn@213.87.128.75] has joined #shogun 19:54 lisitsyn: could you check the unit tests on your machine? 19:58 lisitsyn: I think they are green now, I removed the "C" in the class name to make things work in the python script 19:58 HeikoS: now yes 19:58 meeting is in 1 hour yes? 19:59 pickle27: yes 19:59 cool cool 19:59 be back then! 
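The "pass pre-allocated vectors as references" suggestion above can be sketched like this. It is a hedged illustration, not the actual `evaluate_quadgk`: the function name `evaluate_intervals`, the use of `std::vector` in place of `SGVector`, and the quadrature itself (Simpson's rule, with the difference to the trapezoid rule as a crude error estimate, instead of Gauss-Kronrod) are all assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the private quadrature routine discussed above.
// Integrates f over each interval [a[i], b[i]] and writes the estimate and
// an error indicator into the two output vectors.
void evaluate_intervals(const std::function<double(double)>& f,
                        const std::vector<double>& a,
                        const std::vector<double>& b,
                        std::vector<double>& vals,
                        std::vector<double>& errs)
{
    const std::size_t n = a.size();
    assert(b.size() == n);

    // Outputs must be either empty (allocated here) or already the right size.
    assert(vals.empty() || vals.size() == n);
    assert(errs.empty() || errs.size() == n);
    if (vals.empty()) vals.resize(n);
    if (errs.empty()) errs.resize(n);

    for (std::size_t i = 0; i < n; ++i) {
        const double h = b[i] - a[i];
        const double fa = f(a[i]);
        const double fm = f(0.5 * (a[i] + b[i]));
        const double fb = f(b[i]);
        const double trapezoid = 0.5 * h * (fa + fb);
        const double simpson = h / 6.0 * (fa + 4.0 * fm + fb);
        vals[i] = simpson;                          // approximate integral
        errs[i] = std::fabs(simpson - trapezoid);   // crude error indicator
    }
}
```

A caller that evaluates many batches can allocate `vals` and `errs` once and reuse them, which avoids repeated allocation; a caller that does not care can pass empty vectors and let the function size them.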
19:59 -!- pickle27 [~Kevin@67.193.243.174] has quit [Quit: Leaving] 19:59 HeikoS: ooooohhhhh I thought they are with C here 19:59 -!- az_de [57a25d66@gateway/web/freenode/ip.87.162.93.102] has joined #shogun 20:00 -!- lisitsyn [~lisitsyn@213.87.128.75] has quit [Ping timeout: 256 seconds] 20:04 -!- az_de [57a25d66@gateway/web/freenode/ip.87.162.93.102] has quit [Quit: I'll be back for the meeting] 20:13 -!- lisitsyn [~lisitsyn@213.87.128.75] has joined #shogun 20:17 -!- travis-ci [~travis-ci@ec2-54-224-203-225.compute-1.amazonaws.com] has joined #shogun 20:20 [travis-ci] it's Sergey Lisitsyn's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9398938 20:20 -!- travis-ci [~travis-ci@ec2-54-224-203-225.compute-1.amazonaws.com] has left #shogun [] 20:20 -!- van51 [~van51@athedsl-399972.home.otenet.gr] has joined #shogun 20:40 -!- pickle27 [~Kevin@130.15.32.52] has joined #shogun 20:42 -!- iglesiasg [~Fernando@s83-179-44-135.cust.tele2.se] has joined #shogun 20:45 -!- mode/#shogun [+o iglesiasg] by ChanServ 20:45 greetings 20:46 hey! 20:47 HeikoS, so what is the state? 20:47 the state of the art 20:47 sonney2k: state on what? 20:47 buildbot/travis 20:48 sonney2k: unit tests green on my machine 20:49 fixed 100000000000 bugs 20:50 and now we have automated detection of many problems 20:50 also, we have a crude test of clone and equals 20:50 (works) 20:50 sonney2k: one thing I still would like to do is to do these tests on some data 20:50 currently, empty class instances do not contain any data 20:50 you know, fill the matrices etc 20:50 HeikoS, you're strong, man. :) 20:51 HeikoS, for what exactly? I mean you need a complete example for that to work?! 20:51 HeikoS, you will lead today right? 20:51 sonney2k: yes 20:51 lets discuss afterwards, 20:52 HeikoS, indeed unit tests work now 20:52 HeikoS: you are the hero! 20:52 HeikoS, congrats you are the hero of today! 
20:52 * sonney2k sends over an ale! 20:53 yeah! :) 20:53 sonney2k: Hero? Even if he broke it? ;) 20:55 HeikoS: Sorry. ;) 20:55 thoralf, about your memleak was this for SGObjects or other memory? 20:56 I mean obj->refcount() gets you the count... 20:56 thoralf, and hey he fixed much more than he broke :D 20:56 I am the one who broke more than fixed 20:57 lisitsyn, true or should I say True :P 20:57 sonney2k: I don't feel the difference you know 20:57 tRuE tRUE TrUE! 20:58 sonney2k: I wanted to see if *all* objects got freed.  (Guessing that you're referring to "Is there a way of checking how many references are still alive in shogun?") 20:58 what is this story about true? 20:58 sonney2k: But HeikoS already told me how to do. :) 20:58 haha yeah I want to hear this lol 20:58 thoralf: enable-ref-count when you configure helps I think 20:58 -!- nube [~rho@36.253.139.165] has joined #shogun 20:58 -!- az_de [57a25d66@gateway/web/freenode/ip.87.162.93.102] has joined #shogun 20:58 iglesiasg: I am true-blind 20:59 thoralf, yeah --trace-mallocs is your friend 20:59 or something like that 20:59 thoralf, you can at any time do a memory print 20:59 Yes.  Thank you all!!one! :) 20:59 iglesiasg, that is always on 20:59 iglesiasg: some people are color-blind I am true-blind 21:00 sonney2k: ah! ok, sorry 21:00 sonney2k: Memory print?  That one is new... elaborate please? 21:00 thoralf, gsomix, and me did spend quite some time on this 21:00 lisitsyn: hehe but what happened? 21:00 iglesiasg: I mixed up true and True 21:00 who is still missing? 21:00 thoralf, list_memory_allocs() 21:00 lisitsyn: python or so? 21:00 iglesiasg: yes 21:00 lamday 21:00 HeikoS: I am compiling stuff I'll report tomorrow ;) 21:01 sonney2k: Ah, already saw this one.  It's related to trace-mallocs as well, right? 21:01 we should do something with this 21:01 HeikoS: lambday is around? 
21:01 not yet 21:01 I know georg wont make it 21:01 I don't see anyone else missing apart from him 21:01 no, I don't think so either 21:01 sonney2k: It's what exit_shogun() does, I guess. 21:01 patrick also wont make it 21:02 -!- nube1 [~rho@49.244.72.22] has quit [Ping timeout: 248 seconds] 21:02 van51, will olivier/benoit make it? 21:02 hushell, lambday 21:02 sonney2k: I don't know tbh 21:02 or quoc? I guess not 21:02 thoralf: yes exit_shogun does it 21:02 thoralf: so you will see hanging refs 21:02 ok lets wait 2 more minutes 21:02 I spoke with Georg this morning anyway, so we are up to date. He also mentioned he will surely make the mid-term form 21:02 otherwise they can use logs 21:02 -!- lambday [67157d36@gateway/web/freenode/ip.103.21.125.54] has joined #shogun 21:03 hi all 21:03 haha cool 21:03 we have hardcoded 21:03 631 21:03 hello hello 21:03 in base/init.h:61 21:03 lambday: hi 21:03 that's the number of classes 21:03 sorry I am a bit late :( 21:03 dont worry 21:03 -!- foulwall` [~user@2001:da8:215:c252:2d09:69b3:4be7:def2] has quit [Remote host closed the connection] 21:04 -!- foulwall [~user@2001:da8:215:c252:2d09:69b3:4be7:def2] has joined #shogun 21:04 lisitsyn: Ugh. ;) 21:05 thoralf: madskillz I guessed the meaning of 631 21:05 lisitsyn: I wish you weren't right. ;) 21:06 HeikoS: just checked your mail.. this is for all of our CG solver for a high default condition for convergence 21:07 lambday: I see, should be remove then 21:07 ok lets start 21:07 the rest can read logs 21:07 Welcome all to the our GSoC meeting before mid-term. 21:08 The plan for today is: 21:08 1.) Tell mentors/students what to do for mid-term 21:08 2.) Hear about the general progress over every group 21:08 3.) Talk about the "big" examples 21:08 Any other points that I missed? Please shout! 21:08 So let's start with mid-term 21:08 The evaluation forms open at July, 29 21:08 Every mentor and every student has to fill one. 
You can access the forms through your google melange page, http://www.google-melange.com/ 21:08 Filling the form takes about 15 minutes, so not a big deal. 21:08 Mentors have to pass or fail their students. 21:09 (Mentors: if you are thinking of failing your student, please talk to me, sonney2k, or lisitsyn. But please let us try to avoid that) 21:09 Students have to give some information on how much work they have invested so far etc 21:09 Hard deadline is on August 2. Please do not postpone this, fill them as soon as possible - this makes us more relaxed :) 21:09 Students: Push your mentors to fill in the form, you won't get money otherwise. 21:09 Questions? 21:09 sonney2k, lisitsyn comments? 21:09 no that's ok 21:10 van51, pickle27, iglesiasg so push your mentors once it's july29 :) 21:10 yes do it ASAP lets say hard deadline july30 21:10 okay! 21:10 so we can hotfix stuff if 21:10 so does az_de send mine in or lisitsyn ? 21:11 yes no procrastination 21:11 main mentor 21:11 pickle27, az_de should do it 21:11 kk 21:11 pickle27: az_de 21:11 the main mentor 21:11 I'll do it 21:11 HeikoS: will do, thanks for the suggestion 21:11 HeikoS: sonney2k: so who do I push? 21:11 if not possible let us know 21:11 and we will do it 21:11 az_de, thanks! 21:11 Olivier? 21:11 van51, as you wish :D 21:11 it takes 1 minute so I don't care 21:11 yeah 15 minutes is a bit of an overestimate 21:12 yes 21:12 or /mind 21:12 Who will I push, sonney2k or cheng? 21:12 foulwall: sonney2k 21:12 foulwall, yeah me 21:12 ok sonney2k 21:12 I will push sonney2k too :) 21:12 and myself  ;) 21:12 we all push sonney2k all the time 21:12 I start to feel even smaller now 21:12 sonney2k, push? hugs! 21:13 Ok then, lets continue? 21:13 Could every group give a short (!) summary of recent work and future plans? 21:13 Who wants to start? 21:13 sonney2k: we will pop you once 21:13 HeikoS, yeah please continue before we start a hug-fest 21:13 ok then, van51 would you like to start? 
21:13 ok 21:13 other, pls prepare some text already to make this faster 21:14 well, support for hashing has been added for text collections 21:14 and also for the dense and sparse features 21:14 and also a nice comparison has been done in a webspam dataset that shows the nice speedup gained and that robustness was maintained 21:15 -!- Cheng [~yaaic@ip-109-45-0-25.web.vodafone.de] has joined #shogun 21:15 van51: nice, is this available? 21:15 on that project what remains is mostly to add support for quadratic features 21:15 HeikoS: I will make it available somewhere 21:16 cool, maybe this fits into the last point today 21:16 HeikoS: right now the results are on some output files, I plan on combining them to make it more informatory 21:16 van51: finished? 21:16 who wants next? 21:16 I can go next! 21:17 iglesiasg: please go ahead 21:17 The implementation of LMNN is finished now. 21:17 -!- nube [~rho@36.253.139.165] has quit [Ping timeout: 246 seconds] 21:17 This is useful to find automatically a distance that maximizes the accuracy of multiclass classification, instead of using a particular given distance (typically Euclidean). 21:17 There are nonetheless (many) things to improve and possible extensions; for instance I am currently improving some parts that are rather slow. 21:17 -!- nube [~rho@36.252.193.54] has joined #shogun 21:18 iglesiasg: what are you comparing against? 21:18 I just wonder (sorry to interrupt) - can that be used to estimate what distance is the best? 21:18 lisitsyn: lets do that afterwards 21:18 lisitsyn: it finds the best distance 21:18 HeikoS: do what? 21:19 discuss :) 21:19 ahh well ok 21:19 iglesiasg: next steps? 21:19 HeikoS: against the original implementation by LMNN's author 21:20 HeikoS: it is in Matlab, my implementation is slow because they use some heuristics 21:20 We want also to compare the performance of LMNN with other multiclass classification techniques implemented in Shogun. 
At first we are going to use the MNIST dataset, and afterwards some metagenomics datasets Georg is gathering. 21:20 HeikoS: short-time, make these comparisons with other multiclass classification, and benchmark agains the original implementation once mine gets faster 21:20 HeikoS: after, add some extensions. For instance, use LMNN for dimension reduction, enforce some constraints in the metric learnt, etc 21:21 iglesiasg: ah nice, useful, maybe also good to keep the codes for illustration/examples later 21:21 ok cool, gsomix would you like to continue? 21:21 HeikoS, yep 21:21 thanks iglesiasg 21:22 ok, some simple I/O system for SHOGUN is almost done. there are reading and writing tools and preliminary versions of classes that works with csv and libsvm files. 21:22 HeikoS: good idea 21:22 there are needed some cool features for csv, for example 21:22 I'm discussing some architecture aspects with Soeren. so, before mid-term it will be available for all - we should merge my big messy code. :3 21:22 gsomix: nice, I will probably use the csv stuff! and after mid-term? 21:23 me too! 21:23 HeikoS, yep. it will be completely done 21:23 next we plan work with protobuf format, matlab's m-files and so on what contains in my proposal 21:23 I think sonney2k will correct me. right? 21:23 hehe 21:24 gsomix: awesome matlab files! 21:24 alright, thanks gsomix 21:24 pickle27: could you be next? 21:24 sure! 
21:24 First of all 21:24 reading matlab files from C++ easily will be awesome 21:25 all of the Aproximate Joint Diagonalization (AJD) techniques from the R package have been ported to c++ 21:25 the last 2 still need to be push to shogun though 21:25 I've also code 3 ICA techniques SOBI, JADE and FFSep 21:25 Im chasing down a strange bug in Jade but its pretty close 21:25 the other 2 techniques seem to work well, FFSep still needs to be pushed 21:26 I have a nice signal example for python and one for matlab as well that I need to push shortly 21:26 pickle27: nice, looking forward to look at that 21:26 I'd also like to do a R example but I can't get the interface to build 21:26 whats next for me 21:26 is the audio example which Im starting on this week! 21:27 pickle27, tell me what does not work later it builds fine here and on the buildbots 21:27 will do! 21:27 thoralf, was having the same problem with the R interface 21:27 pickle27: cool, audio example is something we dont have yet :) 21:27 HeikoS, as notebook! 21:27 ha 21:27 btw why not 21:27 R interface - we really should get a hacker to solve the modular one 21:27 pickle27: what about doing it with ipython notebook? 21:28 sonney2k: I was going to says that, notebooks support audio 21:28 sorry what is notebook 21:28 lisitsyn, pickle27 everyone should do an ipython notebook 21:28 I was going to do it in python though 21:28 pickle27: I will explain later 21:28 last point for today 21:28 pickle27, HeikoS: Yeah. 21:28 but first summaries 21:28 pickle27: thanks, 21:28 votjakovr: you want to be next? 21:28 pickle27: http://nbviewer.ipython.org/url/jakevdp.github.com/downloads/notebooks/XKCD_plots.ipynb like that 21:28 HeikoS: sorry was it decided already? 21:29 HeikoS: ok 21:29 about notebooks 21:29 i've finished probit classifier, numerical integration stuff, logit and expectation propagation (EP) classifiers will be avaliable before mid-term. 
Next i'd like to review/debug model selection framework, because it's one of key part of GPs 21:29 lisitsyn: yeah, every gsoc project will have to do one, but more on that later 21:29 HeikoS: oops I missed that 21:29 -!- travis-ci [~travis-ci@ec2-54-224-203-225.compute-1.amazonaws.com] has joined #shogun 21:30 [travis-ci] it's Heiko Strathmann's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9398968 21:30 -!- travis-ci [~travis-ci@ec2-54-224-203-225.compute-1.amazonaws.com] has left #shogun [] 21:30 votjakovr: true  that wil be the next step after 21:30 votjakovr: so the other stuff is almost done? 21:30 And after that, i'll plan to work on multiclass classification 21:31 lisitsyn, well you chickened out 21:31 votjakovr: cool, thanks! 21:31 hushell is not here 21:31 so lambday is next :) 21:31 sonney2k: buck buck buck 21:31 lisitsyn:  tztztz chatty as before ;) 21:32 HeikoS: alright 21:32 The main goal of our project is to estimate log determinant of a huge sparse matrix that arises from log-likelihood estimate expression of a huge GMRF, computing which directly is not possible using traditional techniques due to computational overloads... 21:32 the technique that we're using for our purpose is to approximate matrix function (matrix logarithm to be specific) using techniques from numerical linear algebra and complex analysis.. 21:32 which results in a shifted family of linear systems that are to be solved, involving complex shifts... 21:32 as of now, using dense matrix linear operator and direct solving techniques, we have a working log-det estimator using this technique which gives a pretty good accuracy using Gaussian samples.. 21:33 in order to achieve this, we have developed a computation framework that forms several individual computation jobs which can be solved in parallel... 
21:33 we have designed a sequential version of a computation engine which solves these jobs one by one..
21:33 we believe this framework can be really useful for other purposes in the future as well..
21:33 we have also added iterative solvers that are particularly suitable for large sparse linear operators... various methods based on the conjugate gradient (CG) technique, solving for each of these shifts individually or simultaneously, have been added...
21:33 future work involves implementing an iterative eigensolver for these huge sparse matrices (that requires the Lanczos algorithm to be implemented).. this is needed for computing the shifts in the shifted system... (we used a direct eigensolver for dense systems in order to make sure the framework works properly)
21:33 we also have to use a greedy graph coloring strategy to color the sparse matrix graph to obtain a set of vectors that are to be used instead of Gaussian vectors...
21:33 we may also have to use preconditioned CG solvers for each of the shifts in the systems in case we notice poor convergence behavior for our shifted family solver that solves for all the shifts at once..
21:34 if time permits, we'll move towards a parallel implementation of our computation engine which will surely provide us with a powerful means of computation...
21:34 btw the parallel framework can be used by everyone
21:34 once it works
21:34 yes
21:34 for independent jobs that need to be solved at once
21:34 HeikoS: we should put everything on these rails
21:34 (I believe so!)
21:34 lisitsyn: yes I agree, or at least new stuff
21:34 since a lot of work
21:35 lisitsyn, lambday that's something to discuss once GSoC is more towards its end
21:35 but lots of possibilities, we had a long discussion at the workshop
21:35 lambday: thanks!
21:35 HeikoS: alright.. I am really excited :)
21:36 so last but not least, foulwall, could you close the summaries?
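[editor's note] The log-det summary above relies on the identity log det(A) = trace(log A) and on sampling that trace with Gaussian vectors. The sketch below illustrates only the sampling step, under a strong simplifying assumption: A is diagonal, so log(A) is just the elementwise log of the diagonal. The project described in the log instead approximates log(A) applied to a vector through shifted linear systems; none of that is reproduced here, and the function name `logdet_estimate` is made up for illustration.

```python
import math
import random


def logdet_estimate(diag, num_samples, rng):
    """Monte Carlo estimate of log det(A) = trace(log A) for A = diag(diag).

    Assumption: A is diagonal, so s^T log(A) s = sum_i log(a_i) * s_i**2.
    """
    total = 0.0
    for _ in range(num_samples):
        # Gaussian probe vector s ~ N(0, I); E[s^T B s] = trace(B).
        s = [rng.gauss(0.0, 1.0) for _ in diag]
        total += sum(math.log(a) * si * si for a, si in zip(diag, s))
    return total / num_samples


diag = [1.0, 2.0, 4.0, 8.0]
exact = sum(math.log(a) for a in diag)  # log det of a diagonal matrix
estimate = logdet_estimate(diag, 20000, random.Random(0))
print(exact, estimate)
```

The estimate converges slowly (the error shrinks like one over the square root of the sample count), which is one reason the log mentions replacing Gaussian vectors with vectors obtained from graph coloring.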
21:36 ok HeikoS , here I am
21:36 I turned the shogun-demo site into a framework, and currently we can use the framework to make classification, regression and clustering demos. When creating a demo, the framework only needs a python dict to specify the style of the web ui, and a few dozen lines of backend code to tell it what algorithm to use. After that, the framework will generate the javascript/html/css for the ui and connect the user input to the algorithm.
21:36 I made a demo for kernel matrix heatmap visualization.
21:36 foulwall: cool! I like heatmaps
21:37 I made a modular toy-data generator and importer. The generator can generate random sine data; it only generates sine for now, but new functions and arguments are easy to add to the module. The importer can import features and labels from hdf5 files, and the demo can plot the data on the coordinate system. All the demos can use the data fed back from the generator and importer.
21:37 -!- nube [~rho@36.252.193.54] has quit [Ping timeout: 264 seconds]
21:37 I made a digit recogniser as http://shogun-toolbox.org/static/media/ocr.swf , now it works well.
21:37 foulwall: oh, maybe also have a look into the DataGenerator class then, MeanShiftDataGenerator for example. If you coded up your generators in C, they would be available from all interfaces
21:38 Now I'm working on a static dimension reduction demo, based on the existing demo at http://tapkee.lisitsyn.me, thanks lisitsyn.
21:38 HeikoS: I'll use python interface for that.
21:38 foulwall: maybe let's discuss at some point, I'll write an email
21:38 HeikoS, yeah that was the idea to later switch to data generators
21:39 HeikoS, so we can have standalone demos that are the same as the web based one
21:39 foulwall: cool stuff, this will impress people, maybe talk to other GSoC students on demos on their projects (this could be one of your big examples)
21:39 nice
21:39 I'll speed up to finish all the ideas under "Develop interactive machine learning demos"
21:39 yes I want to make my audio example on the web
21:39 foulwall, please give us a long README showing us how to do it for some algorithm
21:40 pickle27, wait for heiko's step 3)
21:40 foulwall, thanks!
21:40 ok then that was all students, any remarks?
21:40 ok sonney2k . I'll make it
21:40 If not, last point for today:
21:40 As said, we expect examples of how to use your code: C++ and python modular at least. These examples should be small and illustrate how to use classes etc.
21:40 -!- nube [~rho@36.253.161.210] has joined #shogun
21:40 In addition, we would like every student to create a bigger example with a real-life dataset. This example should go a bit more into depth, explain the method more, play a bit around with it, visualise it.
21:40 We would like you to do those as an IPython notebook.
21:40 These allow you to combine text/code/plots/latex in one file.
21:41 We are currently working on a way to automatically generate website pages from those.
21:41 These will look like the ones I presented at the workshop (but should be more detailed, explain more):
21:41 http://nbviewer.ipython.org/5982625
21:41 http://nbviewer.ipython.org/5982623
21:41 http://nbviewer.ipython.org/5982626
21:41 There are more examples on the web, they can also embed sound
21:41 with html sound player
21:41 sounds very cool
21:41 http://nbviewer.ipython.org/urls/raw.github.com/Carreau/posts/master/07-the-sound-of-hydrogen.ipynb
21:42 cool
21:42 The goal is to properly document your GSoC project using a notebook.
21:42 The examples are also a requirement to pass GSoC. We think this is a very cool opportunity to tell the world how cool your project is.
21:42 Questions?
21:42 HeikoS, how do we get them from our notebook folder to be on the web?
21:42 sounds great I like the notebooks idea
21:42 sonney2k: wiking and me are working on it
21:42 HeikoS, I added the gausskernel svm thingy to git but hmm..
21:42 sonney2k: we will generate the full output automatically, store it and produce a link to the viwer
21:43 viewer
21:43 HeikoS, and we somehow need to connect this with the demos foulwall is doing - bidirectional
21:43 also, this detects api changes: the notebook build will fail
21:43 so people can try interactively and also get more details
21:43 in the notebook
21:43 the notebooks are python code and can be downloaded / executed locally / interactively
21:44 https://notebookcloud.appspot.com/docs
21:44 connecting them with the web demo would be awesome, but haven't thought about this yet
21:44 something that may help
21:44 Cheng: that's a nice tool!
21:44 HeikoS, but we should. Everything we have as notebook should have a webdemo too
21:44 ok, I would agree
21:44 So every student, please start playing with the notebooks and start creating one, we want them to be ready before GSoC is over to give feedback. In particular, students can give suggestions to each other when something is unclear.
21:45 Ok, that was it from my side
21:45 awesome, I'll build my audio demo and notebook side by side!
21:46 anyone has some remarks?
21:46 HeikoS: that is a good idea, students giving feedback to each other
21:46 iglesiasg: yes, we will talk about this in another meeting after mid-term
21:46 ok, then the meeting is over
21:46 yeah it will help us really understand what everyone's been doing!
21:46 I would like to understand better the techniques in some other projects, this looks like a nice way of achieving that
21:46 az_de, lisitsyn can we discuss in a new channel for a sec?
21:46 Feel free to discuss ideas :)
21:46 pickle27: yes sure
21:46 I have to rush off
21:46 shogun_bss
21:47 sonney2k: you have the lead now
21:47 all right, thanks HeikoS!
21:47 iglesiasg: what I wanted to ask you
21:47 iglesiasg: can we like
21:47 lisitsyn: tell me
21:47 -!- HeikoS [~heiko@nat-179-227.internal.eduroam.ucl.ac.uk] has quit [Quit: Leaving.]
21:47 estimate what is the best distance (among say euclidean, chi2 whatever)
21:47 according to LMNN distance matrix
21:47 lisitsyn: before I said it finds the best distance, I must add something in there however
21:47 iglesiasg: no it finds the best mahalanobis distance right?
21:48 lisitsyn: exactly
21:48 I mean can we say what is the best 'default' distance
21:48 like select
21:48 lisitsyn: D(x_i,x_j)=(x_i - x_j) M (x_i - x_j)
21:48 lisitsyn: the best M in ^
21:48 lisitsyn: I am missing a transpose in the second difference of feature vectors
21:49 iglesiasg: I mean in MKL we select the best distance
21:49 err
21:49 kernel
21:49 with weighting
21:49 can we do the same - find what distance mostly reproduces the distance computed by LMNN?
21:49 there was a paper by Eric Xing about this
21:49 Cheng: haha you know everything
21:50 lisitsyn: I am thinking.. not sure if I see it
21:50 sonney2k, hey
21:50 iglesiasg: we have a dataset
21:50 we compute LMNN thing
21:50 and see - oh that's mostly euclidean
21:50 I'll use that
21:50 lisitsyn: ok
21:50 but then we take another dataset
21:50 and notice that LMNN computed something
21:51 sonney2k, so, what do we plan to do with readers?
21:51 very similar to say inverse gaussian distance
21:51 iglesiasg: got it?
21:51 lisitsyn: yes
21:51 lisitsyn: then what you were saying would be if these two datasets are together
21:52 have to sleep now, cu guys.
21:52 lisitsyn: if we can weight between these two distances?
21:52 -!- nube1 [~rho@36.252.126.80] has joined #shogun
21:52 foulwall: bye bye!
21:52 iglesiasg: well weighting is another thing
21:52 but just find a distance that corresponds to the thing found by LMNN
21:52 -!- nube [~rho@36.253.161.210] has quit [Ping timeout: 276 seconds]
21:52 among some out of box distances we have
21:52 lisitsyn: so you mean going beyond the family of Mahalanobis distances?
21:54 lisitsyn: using the idea of LMNN but searching in another set of distances
21:54 iglesiasg: no just find the distance that is very similar to the best mahalanobis distance
21:54 it could be great performance wise
21:54 gsomix, yeah I still consider the linear reader + parser design easier to understand.
21:55 lisitsyn: I am sorry, but I am not following :S
21:55 iglesiasg: once we perform LMNN
21:55 foulwall, nite and please continue your work!
21:55 we have d_1: X \times X -> R
21:55 that's the best mahalanobis distance
21:55 lisitsyn: ok
21:56 say we have {d_2, ... , d_N} - a set of distances
21:56 sonney2k, what to do with writing? I plan to add CWriter, CFileWriter and CStringWriter.
21:56 like euclidean etc
21:56 lisitsyn: ok
21:56 iglesiasg: can we find some distance among {d_2, ... , d_N}
21:56 that reproduces d_1 the best way
21:56 lisitsyn: ok, I understand now the problem :)
21:57 -!- foulwall [~user@2001:da8:215:c252:2d09:69b3:4be7:def2] has quit [Ping timeout: 245 seconds]
21:57 iglesiasg: I don't know any good criteria for that
21:57 but as Cheng said there is a paper
21:57 ;)
21:57 lisitsyn: well, there can be several
21:58 http://www.cs.cmu.edu/~liuy/distlearn.htm
21:58 lisitsyn: if you are representing your distances as all of them Mahalanobis, then you could define a distance measure between matrices
21:58 gsomix, not sure what you need there - I mean you could use CFile's functions directly
21:58 iglesiasg: no my point was to find some simpler distance
21:58 because mahalanobis is a bit slow
21:58 Cheng: thanks!
21:58 sonney2k, but line reader now is not only for lines with '\n' at end. because we have Tokenizer inside. why not allow line reader to be more cool and read primitive types?
21:59 lisitsyn, do you know why SNP kernel still fails on the buildbot?
21:59 sonney2k, what functions do you mean?
21:59 gsomix, I mean for writing
21:59 sonney2k: hmm actually I just ran tests
21:59 lisitsyn, yeah me too
21:59 and it failed on SGVectorTest.complex64_tests
21:59 errm no all worked here
22:00 may be some old binaries
22:00 let me check again
22:00 lisitsyn: so how are you thinking these {d_2, ... , d_N } distances would be specified?
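[editor's note] The question in the exchange above is how to pick, from a set of off-the-shelf distances {d_2, ..., d_N}, the one that best reproduces the learned Mahalanobis distance d_1. A minimal sketch of one possible criterion follows. Everything in it is an assumption for illustration: the matrix M stands in for an LMNN result (no LMNN is run), the candidates are squared Euclidean and Manhattan, and the score is the Frobenius norm of the difference between pairwise distance matrices on a tiny made-up dataset. The log does not settle on a criterion, so this is not the method the participants agreed on.

```python
def mahalanobis_sq(x, y, M):
    """(x - y)^T M (x - y) for lists x, y and a square matrix M (list of rows)."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(m * dj for m, dj in zip(row, d)) for row in M]
    return sum(di * mdi for di, mdi in zip(d, Md))


def sq_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def distance_matrix(dist, X):
    """All pairwise distances on the dataset X."""
    return [[dist(x, y) for y in X] for x in X]


def frobenius_gap(A, B):
    """Frobenius norm of A - B, used here as the (assumed) mismatch score."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)) ** 0.5


X = [[0.0, 0.0], [1.0, 2.0], [3.0, 1.0], [2.0, 4.0]]  # toy dataset
M = [[1.0, 0.1], [0.1, 1.0]]  # hypothetical learned matrix, close to identity

D1 = distance_matrix(lambda x, y: mahalanobis_sq(x, y, M), X)
candidates = {"sq_euclidean": sq_euclidean, "manhattan": manhattan}
gaps = {name: frobenius_gap(D1, distance_matrix(d, X)) for name, d in candidates.items()}
best = min(gaps, key=gaps.get)
print(gaps, best)
```

Note that the score is tied to the dataset X, and that a raw norm is sensitive to the scale of each candidate distance; a downstream measure such as classification accuracy is another option.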
22:00 gsomix, for reading I can see the benefit of having a line reader and a separate parser to support other ascii line based formats
22:00 iglesiasg: by their distance matrices
22:00 Cheng: btw we had issue with our vspaces :D
22:01 lisitsyn: aham, so you would find the distance that reproduces d_1 the best way *given* a particular dataset
22:01 Cheng: you may have noticed we had really dense things out there in paper
22:01 iglesiasg: yes exactly sorry I missed that point
22:01 -!- pickle27 [~Kevin@130.15.32.52] has quit [Quit: Leaving]
22:02 -!- az_de [57a25d66@gateway/web/freenode/ip.87.162.93.102] has quit []
22:02 lisitsyn: then I think somewhat the same idea as before applies. You define a measure between matrices
22:02 sonney2k: are you leaving soon?
22:02 iglesiasg: natural measure is norm but I find it a bit worse
22:02 than something fance you can use
22:02 fancy*
22:02 van51, why?
22:03 lisitsyn: something fancy like?
22:03 iglesiasg: say accuracy
22:03 van51, did you get the same result like the kernel one?
22:03 sonney2k: I wanted to ask you a bit about the hashing features again :)
22:03 lisitsyn: no hacking latex
22:03 Cheng: yeah we had to remove it
22:03 I might be off soon
22:03 sonney2k, ok. but I still need a class for buffered writing.
22:03 van51, just ask never ask to ask
22:03 and cut down some text
22:03 sonney2k: it could wait till tomorrow
22:03 sonney2k: anyway, first of all I didn't really get your email
22:04 Cheng: thanks for the pointers to ipython and metric learning!
22:04 sonney2k: I get that if the target and the original dimensions are the same, then the hashing could be avoided
22:04 gsomix, I don't understand what for... if you use CFile you would just override set_matrix and done
22:04 van51, if I am not around I will reply when I read the logs so just ask :)
22:05 sonney2k, what functions should I use for writing?
22:05 lisitsyn: sure. I think many different ideas can make sense to measure similarity between distance matrices
22:05 sonney2k: ok :)
22:05 gsomix, the set_* ones
22:05 gsomix, the get_* ones are for reading
22:05 sonney2k, no, inside set_*
22:05 in realisation
22:05 gsomix, and set_* ones for writing
22:05 sonney2k: I just had a quick discussion with Benoit and he told me that the way I wrote the pseudocode in the email is the way to go for numerical features
22:06 gsomix, we are talking csv here right? so fprintf("%s," ...)
22:06 I need some tools for writing
22:06 iglesiasg: it might be that it is not worth implementing in C++ but it could be in notebook
22:06 sonney2k: so I will send a PR fixing what I was doing
22:06 sonney2k, is it fast for big data?
22:06 sonney2k: but I would like you to elaborate a bit on your email, what did you mean by inc step?
22:07 iglesiasg: like 'see, we can see the learned distance is very similar to D so we can use D'
22:07 van51, the thing you do in the first pseudo code
22:07 gsomix, file i/o is slower anyways and it is already buffered underneath
22:08 lisitsyn: yep, but I guess that LMNN gets more powerful when you don't easily know distances that are similar to LMNN's solution
22:08 iglesiasg: indeed but sometimes you need performance but don't know what distance to use
22:08 lmnn might guide you here
22:08 lisitsyn: but I think I understand your idea for the notebook, as a way to illustrate what LMNN achieves
22:09 van51, which of the pseudocodes did Benoit mean? I don't have the email here (please use my shogun-toolbox.org email address for that)
22:09 sonney2k: ah sorry about that
22:09 -!- travis-ci [~travis-ci@ec2-50-16-34-49.compute-1.amazonaws.com] has joined #shogun
22:09 [travis-ci] it's Sergey Lisitsyn's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9399184
22:09 -!- travis-ci [~travis-ci@ec2-50-16-34-49.compute-1.amazonaws.com] has left #shogun []
22:09 -!- pickle27 [~kevin@rcv3-lab-pc.ee.queensu.ca] has joined #shogun
22:09 sonney2k: the second one we discussed
22:09 sonney2k, I don't understand. we use our own buffering for reading. because reading is slow.
22:09 sonney2k: do you want me to forward it to you?
22:09 van51, so does my reply make sense now?
22:09 all right, I think I am off to make dinner now. Will be back afterwards!
22:09 sonney2k: well you could still have index clashes for dense features
22:10 gsomix, yes reading huge blocks helps a lot
22:10 sonney2k: since you may map to a dimension lower than the original
22:10 gsomix, ok do some benchmark for some big file and then we will decide
22:10 sonney2k, ok
22:10 -!- Cheng [~yaaic@ip-109-45-0-25.web.vodafone.de] has quit [Ping timeout: 264 seconds]
22:10 gsomix, if it is necessary then we could have a similar thing to the reader for writing
22:11 gsomix, writing full lines or even 1MB chunks
22:11 van51, yes you can have clashes - actually always due to hashing
22:12 sonney2k: yes
22:12 van51, but that is not the issue here
22:12 van51, you know the indices for dense features
22:12 -!- iglesiasg [~Fernando@s83-179-44-135.cust.tele2.se] has quit [Quit: Leaving]
22:12 van51, so you know which to multiply with which
22:13 sonney2k: are you referring to quadratic coupling now?
22:13 van51, if there is a collision you don't want to increase its effect by multiplying with it
22:13 van51, yes - I thought that is what we are talking about?
22:14 sonney2k: no :)
22:14 sonney2k: actually we would get to that
22:14 van51, then please start from the beginning
22:14 sonney2k: so I had in my mind that hashing for dense features
22:14 sonney2k: w/o the quadratic coupling
22:14 ahh ok
22:14 sonney2k: would result in something like a discretization process
22:14 errm
22:15 what?
22:15 sonney2k: and I had treated the values of the new hashed representation as a counter -- similar to what I was doing in the doc features
22:15 van51, ok so same mistake then
22:15 sonney2k: I have a commit ready that fixes that
22:16 sonney2k: I just have to fix the unit tests
22:16 van51, very good
22:16 sonney2k: I will make a PR again now then, please review it when you can
22:16 sonney2k: to make sure I didn't miss something again
22:16 sonney2k: Benoit told me he would also have a look tonight
22:16 van51, you can again compare it with a polynomial kernel of degree 1 - once done on the hashed and on the non-hashed
22:16 sonney2k: tonight there that is :p
22:16 van51, did you understand what I wrote in my email in this context?
22:17 van51, I mean with the indices are given / we want to hash indices?
22:17 sonney2k: not really sorry :)
22:18 for me this is probably too obvious - I have been doing this for too long and I realize I suck at explaining
22:18 van51, let me try again
22:18 sonney2k: isn't it just a new feature? I mean you hash its index and add there the combined value
22:19 van51, when you have dense features of dim D you know you have dims 1...D
22:19 van51, OK?
22:19 ok
22:19 van51, now with hashing you compress it down to sth like 2**
22:19 yea
22:19 I mean D -> 2**nbits
22:19 so you hash(i) i=1...D
22:20 in some for loop
22:20 and the dimensions you iterate over (1..D) are known upfront
22:20 van51, OK?
22:20 yeap
22:20 with the hasheddocfeatures it is totally different
22:20 you have strings as input
22:20 and you use n-grams
22:20 OK so far?
22:21 so you don't even have any dimensions
22:21 sonney2k: I wasn't referring to ngrams/tokens in that last pseudocode
22:21 sonney2k: but carry on
22:21 van51, what then? it was code from vw no?
22:21 sonney2k: no not yet, I was just trying to make sure that I had got right what the quadratic features for the dense would be
22:22 van51, please send me the email again
22:22 sonney2k: on it
22:22 I don't know what pseudocode we are talking about
22:22 sonney2k: hehe it's a bit vague this discussion :)
22:23 yeah well I told you I never have access to your email on my laptop
22:24 sonney2k: I just hit reply all on an old email
22:24 van51, and we name these 3 pseudocodes a,b,c so I know what you are talking about
22:24 sonney2k: didn't think about the address :/
22:24 ok
22:24 nothing there yet...
22:25 van51, maybe just paste it somewhere
22:25 sonney2k: I've sent it, to your shogun address
22:25 dpaste etc
22:25 sonney2k: ok wait
22:26 http://pastebin.com/0GnacfQ5
22:26 sonney2k: ^
22:26 van51, so which algorithm did you mean?
22:27 -!- shogun-notifier- [~irker@7nn.de] has quit [Quit: transmission timeout]
22:27 I thought you meant a)
22:27 sonney2k: a) is for the hashed doc features
22:27 sonney2k: c) would be quadratic for dense, right?
22:27 yeah that is what I was referring to
22:28 you need to compute the hash for some n-gram to get an index
22:30 good night people
22:30 gsomix, good night!
22:30 van51, you don't have an index like you have with dense
22:30 van51, agreed?
22:30 sonney2k: yes
22:31 van51, this doesn't make a difference for linear features
22:31 sonney2k: I think for tokens, vw concatenates them and hashes that
22:31 van51, because when you do a) and get the same h_idx multiple times you just add 1 to the value each time
22:32 OK?
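[editor's note] The difference described above between hashing dense features and hashing document tokens can be sketched as follows. This is an illustration under stated assumptions, not Shogun's implementation: `zlib.crc32` is only a stand-in for whatever hash the real code uses, `NUM_BITS` and all function names are hypothetical, and indices are 0-based here where the log counts 1...D.

```python
import zlib

NUM_BITS = 4                 # hypothetical target size: 2**NUM_BITS buckets
DIM = 2 ** NUM_BITS


def bucket(key):
    """Map any key to a bucket index; crc32 is only a stand-in hash."""
    return zlib.crc32(str(key).encode("utf-8")) % DIM


def hash_dense(x):
    """Dense input: indices 0..D-1 are known, so hash the index, add the value."""
    h = [0.0] * DIM
    for i, value in enumerate(x):
        h[bucket(i)] += value
    return h


def hash_tokens(tokens):
    """Document input: no indices exist, so hash the token itself and add 1."""
    h = [0.0] * DIM
    for tok in tokens:
        h[bucket(tok)] += 1.0
    return h


dense = hash_dense([0.5, 0.0, 2.0, 1.5, 3.0])
doc = hash_tokens(["the", "cat", "sat", "on", "the", "mat"])
print(dense)
print(doc)
```

In the dense case the hashed vector carries the original values, while in the token case each bucket holds a count, so a repeated token (or a collision) simply adds 1 again, as the last message above says.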
22:32 -!- lambday [67157d36@gateway/web/freenode/ip.103.21.125.54] has quit [Ping timeout: 250 seconds]
22:32 yeap
22:32 if you go quadratic features though
22:32 you would need to know how often h_idx is the same
22:33 because you would do count[h_idx_1] * count[h_idx_2]
22:33 that is pretty annoying
22:34 van51, agree?
22:34 sonney2k: yes
22:34 sonney2k: I see that issue now
22:35 van51, you don't need that with dense
22:35 that is what I wanted to say
22:35 sonney2k: ok then :)
22:35 with dense you have the value and the index
22:35 sonney2k: well the answer for that is in vw, so I will read the code that Benoit suggested and see what I can do
22:35 sonney2k: yes idd
22:36 -!- FSCV [~FSCV@50.7.50.60] has joined #shogun
22:36 sonney2k: ok, so coming up : a PR to fix what I was doing wrong
22:37 sonney2k: and a PR for quadratic on dense
22:37 actually an update on that
22:37 van51, I only see 2 solutions currently - keep all the h_idx around, sort them and then go over them or use some hashmap on them and then iterate over the values
22:37 van51, yeah does quadratic work now?
22:38 sonney2k: I had rolled back my code to fix that
22:38 sonney2k: so I will do it again now
22:38 fix what?
22:38 sonney2k: I'm confident that it'll be working though
22:38 sonney2k: what I was doing wrong in hashed dense
22:38 sonney2k: if you want to browse a bit through the commit now to understand what I mean, it's here : https://github.com/van51/shogun/commit/a46a05f62d0bc6779b20c87e6c41e22819ad92a5
22:39 van51, are we talking linear or quadratic now?
22:39 sonney2k: linear
22:40 b'coz I was talking quadratic :D
22:40 van51, but nevertheless comparing it to a linear kernel is an excellent test
22:40 sonney2k: ok. I know what to do now :)
22:41 or polykernel degree=1 - which is the same as linear w/o
22:41 -!- lisitsyn1 [~lisitsyn@213.87.139.248] has joined #shogun
22:41 normalization
22:41 -!- lisitsyn [~lisitsyn@213.87.128.75] has quit [Ping timeout: 268 seconds]
22:41 sonney2k: yes yes
22:42 van51, is sparse* working or does this also need a fix?
22:44 sonney2k: the fix is in the same commit
22:44 for that
22:44 van51, that would also be a good test btw comparing dense and sparse - result should be 100% the same
22:44 sonney2k: idd it can be in an example to demonstrate the classes
22:46 van51, actually you do it exactly the way you need to do it for hasheddocfeatures there - computing the hashes first, storing them in some list
22:46 van51, do a qsort
22:46 then count etc
22:46 -!- pickle27 [~kevin@rcv3-lab-pc.ee.queensu.ca] has quit [Quit: Leaving]
22:46 van51, so quadratic would work then with hasheddoc
22:46 van51, and I immediately have a feature request to work with sign(count) for the hasheddocfeatures
22:47 van51, so it would be 1 if a thing appears and not the real count
22:47 this was more stable in some apps I had
22:47 sonney2k: ok I will note that to add it
22:48 sonney2k: about the quadratic support
22:48 sonney2k: maybe I should start it from now as a preprocessor?
22:48 van51, it has to be some totally new framework I am afraid ... preprocessors are *very* old (written 2000)
22:49 -!- lisitsyn1 [~lisitsyn@213.87.139.248] has quit [Read error: Connection reset by peer]
22:49 sonney2k: ah I see
22:49 so they might need some polish and I am not sure they can handle Dot*
22:50 sonney2k: it can go under future work then :)
22:50 I mean what do we want
22:50 compress indices with hashing
22:50 and then maybe change values
22:50 with normalization
22:50 so this would have to hook into the add_to_dense_vec etc functions
22:51 not sure how
22:51 otherwise it would not be fast
22:51 sonney2k: true
22:51 van51, the preprocessor stuff currently takes a vector, changes it and returns it
22:51 changes == new vector but processed
22:52 so that would defeat the purpose of dotfeatures
22:52 sonney2k: ok I see
22:52 sonney2k: well I have a clear plan now again at least and I'm happy :D
22:52 van51, sorry for not looking in detail in the first place
22:54 and please keep it on :)
22:55 sonney2k: no worries. it actually helped me look into it more
22:55 sonney2k: plus I also suck at explaining :P
22:55 ideal team :D
22:56 haha
22:56 -!- votjakovr [~votjakovr@host-46-241-3-209.bbcustomer.zsttk.net] has left #shogun ["Fallen asleep!"]
22:56 -!- travis-ci [~travis-ci@ec2-23-20-210-220.compute-1.amazonaws.com] has joined #shogun
22:58 [travis-ci] it's Heiko Strathmann's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/9401785
22:58 -!- travis-ci [~travis-ci@ec2-23-20-210-220.compute-1.amazonaws.com] has left #shogun []
22:58 -!- hushell [~hushell@8-92.ptpg.oregonstate.edu] has joined #shogun
23:05 -!- iglesiasg [~Fernando@s83-179-44-135.cust.tele2.se] has joined #shogun
23:48 -!- mode/#shogun [+o iglesiasg] by ChanServ
23:48 --- Log closed Wed Jul 24 00:00:44 2013