Temporary increase of limit to two WU at a time



Message boards : Neurona@home: Forum for users in English : Temporary increase of limit to two WU at a time

Profile Javier Villanueva
Project administrator
Project developer
Project scientist
Joined: Jun 12 11
Posts: 183
Credit: 747
RAC: 0
Message 347 - Posted 25 Apr 2012 11:39:36 UTC

    Last modified: 25 Apr 2012 13:41:34 UTC

    Gentlemen,

    the output of the hosts now seems to be stable at approx. 24000 results per day. As requested by and discussed with Mr. Zombie67 and Mr. Tex1954, I'm going to increase the limit of WUs to two. This means that hosts with only 4 GB of RAM but more than one core will probably have to make some adjustments in their BOINC client, or they are going to run out of memory, as the actual memory usage of the WUs is about 2.5 GB each.
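
A minimal sketch of the memory arithmetic behind this warning. The 2.5 GB per WU figure is from the post; the function name `max_concurrent_wus` and the 1 GB reserved for the operating system are assumptions made only for illustration.

```python
import math


def max_concurrent_wus(total_ram_gb: float, wu_ram_gb: float = 2.5,
                       reserved_gb: float = 1.0) -> int:
    """Return how many WUs fit in RAM at the same time.

    wu_ram_gb is the per-WU usage quoted in the post (about 2.5 GB).
    reserved_gb is an assumed allowance for the OS, not a project figure.
    """
    usable = total_ram_gb - reserved_gb
    if usable <= 0:
        return 0
    return math.floor(usable / wu_ram_gb)


for ram in (4, 8, 16):
    print(f"{ram} GB host: {max_concurrent_wus(ram)} WU(s) at a time")
```

Under these assumptions a 4 GB host has room for only one WU, so running two at once would exceed its RAM.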

    This will be maintained for a while to check how the productivity improves and to draw some conclusions. As usual, I'll keep you informed of the results.

    This change will be effective tomorrow at 17:00 CET.

    Regards,

    Javier.

    zombie67 [MM]
    Joined: Jun 14 11
    Posts: 16
    Credit: 807,065
    RAC: 1
    Message 348 - Posted 25 Apr 2012 13:16:24 UTC

      Good news! I am looking forward to the experiment.
      ____________
      Dublin, CA
      Team SETI.USA


      grayhoose
      Joined: Sep 5 11
      Posts: 4
      Credit: 12,983
      RAC: 0
      Message 350 - Posted 26 Apr 2012 5:11:31 UTC

        so in preferences, app 1 & 2 are 1 or 2 WU's?

        Profile Javier Villanueva
        Project administrator
        Project developer
        Project scientist
        Joined: Jun 12 11
        Posts: 183
        Credit: 747
        RAC: 0
        Message 351 - Posted 26 Apr 2012 8:29:37 UTC - in response to Message 350.

          so in preferences, app 1 & 2 are 1 or 2 WU's?


          Sorry, but I'm not sure what your question is about. What exactly do you mean?

          Profile Odicin
          Joined: Jul 9 11
          Posts: 2
          Credit: 48,638
          RAC: 0
          Message 352 - Posted 26 Apr 2012 10:14:38 UTC

            I think he meant to make it adjustable in the preferences. Probably to release a second application and adjust to run 2 of them per host. The first application is the "old" one per host.

            Another idea: Maybe it's possible to make a "hack" like Einstein, where it's adjustable, how many wu's run simultaneously at one gpu.

            Regards Odi
            ____________

            ich_eben*
            Joined: Jun 22 11
            Posts: 11
            Credit: 48,770
            RAC: 0
            Message 353 - Posted 26 Apr 2012 11:40:59 UTC

              very nice.
              i got one rig with 16g (currently 8 - one ram stick showed errors in memtest86+).
              looking forward to 2 workunits.

              Profile Javier Villanueva
              Project administrator
              Project developer
              Project scientist
              Joined: Jun 12 11
              Posts: 183
              Credit: 747
              RAC: 0
              Message 354 - Posted 26 Apr 2012 12:11:51 UTC - in response to Message 352.

                I think he meant to make it adjustable in the preferences. Probably to release a second application and adjust to run 2 of them per host. The first application is the "old" one per host.

                Another idea: Maybe it's possible to make a "hack" like Einstein, where it's adjustable, how many wu's run simultaneously at one gpu.

                Regards Odi


                The number of WUs is configured as a global parameter on the server side; there is no additional application to download.

                About the hack, I was told about this feature in some projects, but it is not available on our server, which uses BOINC "as it is". I guess it is some add-on made by the people responsible for those projects. I asked a couple of them for information about it but got no answer.

                zombie67 [MM]
                Joined: Jun 14 11
                Posts: 16
                Credit: 807,065
                RAC: 1
                Message 355 - Posted 26 Apr 2012 15:59:01 UTC

                  I am suddenly getting download errors. I wonder if it has something to do with the change to two tasks at a time? Also, not yet getting two tasks at a time. FYI.
                  ____________
                  Dublin, CA
                  Team SETI.USA


                  Profile Javier Villanueva
                  Project administrator
                  Project developer
                  Project scientist
                  Joined: Jun 12 11
                  Posts: 183
                  Credit: 747
                  RAC: 0
                  Message 356 - Posted 26 Apr 2012 16:52:03 UTC - in response to Message 355.

                    Last modified: 26 Apr 2012 16:59:23 UTC

                    Couldn't reach the lab on time; I reconfigured the server just now. I have checked it with my own host cluster (user IMM-UPV) and it is working fine: two WUs at a time and no download errors. It is situated outside the Falua server's intranet, so the LAN is working fine.

                    For the record: I got 23780 results in the last 24 hours and 253 in progress. Let's see how we improve the figures.

                    Profile cedricdd
                    Joined: Jun 15 11
                    Posts: 12
                    Credit: 18,729
                    RAC: 0
                    Message 357 - Posted 26 Apr 2012 17:37:19 UTC - in response to Message 356.

                      Last modified: 26 Apr 2012 17:39:04 UTC

                      http://falua.cesfelipesegundo.com/Neurona/workunit.php?wuid=1406280
                      http://falua.cesfelipesegundo.com/Neurona/workunit.php?wuid=1405994
                      ____________
                      Kill all my demons and my angels might die too.

                      Tex1954
                      Joined: Feb 27 12
                      Posts: 20
                      Credit: 50,477
                      RAC: 0
                      Message 358 - Posted 26 Apr 2012 18:50:55 UTC - in response to Message 357.

                        Two WU's per system works better for me... all boxes have at least 8Gig memory and can handle it fine.

                        The problem is with BOINC not giving us control the way we need sometimes.

                        Also, as another mentioned, I see a lot of download errors now...


                        Win7-Compaq

                        485 Neurona@Home 4/26/2012 1:43:50 PM [error] MD5 check failed for Prueba20284681.pro_1335463899
                        486 Neurona@Home 4/26/2012 1:43:50 PM [error] expected d41d8cd98f00b204e9800998ecf8427e, got 0d02c4eb8be40a40ea32ce66df01a03e
                        487 Neurona@Home 4/26/2012 1:43:50 PM [error] Checksum or signature error for Prueba20284681.pro_1335463899

                        I saw this before on other projects and I think they mentioned something about a WU being sent out prematurely or something and it was an easy fix... they said.
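
One detail in the log above can be checked directly: the "expected" checksum is the MD5 digest of zero bytes of input. That suggests, though it does not prove, that the checksum was recorded while the file was still empty, which would fit the "sent out prematurely" explanation. A minimal sketch using Python's standard `hashlib`; the helper name `md5_hex` is only illustrative.

```python
import hashlib

# Both values are copied from the client log lines above.
EXPECTED = "d41d8cd98f00b204e9800998ecf8427e"
GOT = "0d02c4eb8be40a40ea32ce66df01a03e"


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


# The digest of an empty byte string matches the "expected" value.
print(md5_hex(b"") == EXPECTED)  # True
```

The "got" value is the digest of the file the client actually downloaded, so the file itself may well have been intact.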

                        Anyways, let's hope this experiment works well!

                        GOOD luck to us!

                        :D

                        Profile Javier Villanueva
                        Project administrator
                        Project developer
                        Project scientist
                        Joined: Jun 12 11
                        Posts: 183
                        Credit: 747
                        RAC: 0
                        Message 359 - Posted 26 Apr 2012 19:06:22 UTC - in response to Message 358.

                          This MD5 thing happens from time to time. I blamed the ADSL the first time, but there is no ADSL anymore. It affects some WUs and then disappears. So far I haven't found a reason.

                          Tex1954
                          Joined: Feb 27 12
                          Posts: 20
                          Credit: 50,477
                          RAC: 0
                          Message 360 - Posted 26 Apr 2012 19:52:11 UTC - in response to Message 359.

                            Last modified: 26 Apr 2012 19:53:00 UTC

                            At one time they thought it was due to packet fragmentation on another website... not sure if that was the final answer or not... all I know is my ISP tends to break up packets longer than 1480 or so. My cable is therefore set up to use an MTU of 1400 and never any problems or real speed loss...

                            On another note, this is an 8-Gig mem AMD 955BE system running three tasks at same time... one of them in a VirtualBox Linux window as they all three climb to 2.2Gig of memory allocated... so far so good.

                            8-)

                            Crystal Pellet
                            Joined: Feb 16 12
                            Posts: 9
                            Credit: 41,534
                            RAC: 0
                            Message 361 - Posted 27 Apr 2012 15:02:12 UTC - in response to Message 356.

                              For the record: I got 23780 results in the last 24 hours and 253 in progress. Let's see how we improve the figures.

                              That's looking very nice, Javier

                              Since raising the WU's limit to 2: Tasks in progress: ~468 on average

                              I too got a lot of "Error while downloading", but suddenly it stopped about 10 hours ago.
                              It looks like certain batches are involved.

                              Tex1954
                              Joined: Feb 27 12
                              Posts: 20
                              Credit: 50,477
                              RAC: 0
                              Message 362 - Posted 27 Apr 2012 19:17:21 UTC

                                Yup, no more DL problems lately and my output has doubled since Javier increased the limit to two.

                                So far so good!

                                8-)

                                PS: I also made an inquiry on the Einstein forum for some help packaging many tasks into one WU... no response yet...

                                Sightus@CAU
                                Joined: Jul 7 11
                                Posts: 26
                                Credit: 2,000,000
                                RAC: 0
                                Message 363 - Posted 27 Apr 2012 23:47:55 UTC

                                  Hi Javier,

                                  I am now testing the overall results with regard to speed: I set up my quad-core with 4 VMs à 2 WUs plus two host WUs. These 10 WUs consume about 26-29 GB of RAM at the moment. In about 22 hours I will calculate the relative speed compared to a non-VM setup.

                                  Regards sightus

                                  zombie67 [MM]
                                  Joined: Jun 14 11
                                  Posts: 16
                                  Credit: 807,065
                                  RAC: 1
                                  Message 364 - Posted 28 Apr 2012 14:22:54 UTC

                                    Looks like the experiment is successful. Here is a live graph:


                                    ____________
                                    Dublin, CA
                                    Team SETI.USA


                                    Tex1954
                                    Joined: Feb 27 12
                                    Posts: 20
                                    Credit: 50,477
                                    RAC: 0
                                    Message 365 - Posted 28 Apr 2012 15:24:30 UTC - in response to Message 364.

                                      Last modified: 28 Apr 2012 15:38:38 UTC

                                      Yes!!! So far for the 2Gig mem WU's, looks like most folks doubled their output and I would guess the project output will continue to tweak up next couple of days.

                                      Personally, I have no idea what the longer term requirements of Neurona may become, but if the 2Gig Mem WU size can be maintained, that would be awesome!

                                      I'm sure more folks would jump aboard with the 2Gig memory limit and that is nothing but good for the project. Also, the 2 WU limit should be maintained until BOINC Client gets some method to control project "Instances" being run on multi-core boxes... been begging them for that a long time and continue to argue for that.

                                      OR, possibly allow the user to SELECT how many tasks to send in the Project Preferences setup. That may be more useful for folks running server-class boxes or some of the newer high-end 8->16 core CPU's.


                                      8-)

                                      Profile Javier Villanueva
                                      Project administrator
                                      Project developer
                                      Project scientist
                                      Joined: Jun 12 11
                                      Posts: 183
                                      Credit: 747
                                      RAC: 0
                                      Message 366 - Posted 28 Apr 2012 16:06:19 UTC - in response to Message 365.

                                        Thank you all for your feedback. The results are pretty good, we have got more than 100% improvement. 61000 results in the last 24 hours. That's great.

                                        I can give you some information about the current computation. We have already processed about half of the WUs, and there are still 600000 left. All of these are 2.5 GB in size. Once they are finished (maybe in 4 weeks or so?), my intention is to keep computing the rest of the WUs of the previous batch, you know, those of more than 10 GB.
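
A rough sketch of the arithmetic behind these figures, using only numbers quoted in the thread (about 24000 results per day before the change, 61000 after, 600000 WUs left). The parameter `results_per_wu` is hypothetical: the thread does not say how many results each WU needs, so it defaults to 1.

```python
def improvement_percent(before: float, after: float) -> float:
    """Relative increase from before to after, in percent."""
    return (after - before) / before * 100.0


def days_to_finish(wus_left: int, results_per_day: float,
                   results_per_wu: int = 1) -> float:
    """Days needed at a constant daily rate (results_per_wu is an assumption)."""
    return wus_left * results_per_wu / results_per_day


print(round(improvement_percent(24000, 61000)))   # 154
print(round(days_to_finish(600000, 61000), 1))    # 9.8
```

The 4-week guess in the post is longer than this simple estimate. The thread does not explain the difference; `results_per_wu` is included only so that other values can be tried.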

                                        Tex1954
                                        Joined: Feb 27 12
                                        Posts: 20
                                        Credit: 50,477
                                        RAC: 0
                                        Message 367 - Posted 29 Apr 2012 13:13:01 UTC - in response to Message 366.

                                          Thanks for the warning! I presume that means back to one task per machine.

                                          That will mean I will only have 3 or 4 machines that can crunch them one at a time...

                                          8-)

                                          Tex1954
                                          Joined: Feb 27 12
                                          Posts: 20
                                          Credit: 50,477
                                          RAC: 0
                                          Message 372 - Posted 30 Apr 2012 21:35:57 UTC

                                            Last modified: 30 Apr 2012 22:05:54 UTC

                                            Got reply from Einstein@home on how to bundle WU's together...

                                            http://einstein.phys.uwm.edu/forum_thread.php?id=9416&nowrap=true#116964

                                            This is not a general BOINC feature, it requires modifications of basically all project-specific components.

                                            - Make the workunit generator produce such "bundled" workunits, e.g. by grabbing a group of input files at a time from a pool
                                            - Add an "outer loop" to the application that allows to process n input files, one after the other, producing n output files
                                            - Return these n output files as n files of a single result. This involves constructing a "result template" for the transitioner with the correct number of result files.
                                            - Make the validator check / compare the results by pairwise checking / comparing all result files, probably by also adding a loop over all files of the results
                                            - In our case we modified our assimilator to rename the result files to look like these had been processed as individual workunits. Depending on what the assimilator is supposed to do and what is further done with the results on that project, this might not be necessary there.
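
The steps listed above can be sketched in a few lines. This is only a toy illustration of the idea, not BOINC or Einstein@home code: the function names (`make_bundles`, `process_bundle`, `bundles_agree`, `toy_compute`) are hypothetical, files are stood in for by byte strings, and the validator is assumed to require exact equality, which a real project might relax.

```python
from typing import Callable, List, Sequence


def make_bundles(pool: Sequence[bytes], n: int) -> List[List[bytes]]:
    """Work generator step: grab up to n input files at a time from a pool."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [list(pool[i:i + n]) for i in range(0, len(pool), n)]


def process_bundle(inputs: Sequence[bytes],
                   compute: Callable[[bytes], bytes]) -> List[bytes]:
    """Application step: outer loop over n inputs, producing n outputs."""
    return [compute(data) for data in inputs]


def bundles_agree(result_a: Sequence[bytes],
                  result_b: Sequence[bytes]) -> bool:
    """Validator step: compare two results file by file."""
    if len(result_a) != len(result_b):
        return False
    return all(a == b for a, b in zip(result_a, result_b))


def toy_compute(data: bytes) -> bytes:
    """Stand-in for the real science code (hypothetical)."""
    return data[::-1]


pool = [b"in0", b"in1", b"in2", b"in3", b"in4"]
bundles = make_bundles(pool, 2)          # three bundles of sizes 2, 2, 1
host_a = process_bundle(bundles[0], toy_compute)
host_b = process_bundle(bundles[0], toy_compute)
print(len(bundles), bundles_agree(host_a, host_b))  # 3 True
```

The result template and assimilator changes mentioned in the list are server-side configuration and are not covered by this sketch.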


                                            Looks like a programmer has to do some work...

                                            If it was "me" trying to do this, I would kindly ask the folks at Einstein to provide a complete system and simply change whatever I had to change in the WU naming conventions as a start to test on an offline setup... One has to keep in mind that Einstein produces these work units for GPU's only, not CPU tasks.

                                            8-)

                                            Profile Javier Villanueva
                                            Project administrator
                                            Project developer
                                            Project scientist
                                            Joined: Jun 12 11
                                            Posts: 183
                                            Credit: 747
                                            RAC: 0
                                            Message 373 - Posted 30 Apr 2012 22:55:45 UTC - in response to Message 372.

                                              Thanks, Tex.

                                              Basically agree with what the guys at Einstein say. Grouping many WUs into a single one is possible but would require many changes in the SW. I don't think I'm going to be able to do that in the short term due to lack of resources. Anyway, once we move back to the "big" WUs this will not really be necessary.

                                              Sightus@CAU
                                              Joined: Jul 7 11
                                              Posts: 26
                                              Credit: 2,000,000
                                              RAC: 0
                                              Message 375 - Posted 2 May 2012 11:47:48 UTC

                                                Hey Javier,

                                                is it possible to implement a counter on the server status page showing us the total amount of remaining WU per series?

                                                Regards,
                                                sightus

                                                Profile Javier Villanueva
                                                Project administrator
                                                Project developer
                                                Project scientist
                                                Joined: Jun 12 11
                                                Posts: 183
                                                Credit: 747
                                                RAC: 0
                                                Message 376 - Posted 2 May 2012 20:29:50 UTC - in response to Message 375.

                                                  Hey Javier,

                                                  is it possible to implement a counter on the server status page showing us the total amount of remaining WU per series?

                                                  Regards,
                                                  sightus


                                                  Not with the current implementation. I have the WUs on a separate hard disk, and I upload a number of them and download the results periodically, so the server is really not aware of how many of them are left. I can report the number from time to time for your information. Currently there are approx. 500000 WUs left.

                                                  Sightus@CAU
                                                  Joined: Jul 7 11
                                                  Posts: 26
                                                  Credit: 2,000,000
                                                  RAC: 0
                                                  Message 377 - Posted 3 May 2012 7:47:57 UTC

                                                    That's great!

                                                    zombie67 [MM]
                                                    Joined: Jun 14 11
                                                    Posts: 16
                                                    Credit: 807,065
                                                    RAC: 1
                                                    Message 379 - Posted 4 May 2012 14:00:48 UTC - in response to Message 376.

                                                      Not with the current implementation. I have the WUs on a separate hard disk, and I upload a number of them and download the results periodically, so the server is really not aware of how many of them are left. I can report the number from time to time for your information. Currently there are approx. 500000 WUs left.


                                                      Maybe you need to upload some more now? Server is empty.
                                                      ____________
                                                      Dublin, CA
                                                      Team SETI.USA


                                                      Profile Javier Villanueva
                                                      Project administrator
                                                      Project developer
                                                      Project scientist
                                                      Joined: Jun 12 11
                                                      Posts: 183
                                                      Credit: 747
                                                      RAC: 0
                                                      Message 380 - Posted 4 May 2012 15:52:47 UTC - in response to Message 379.

                                                        The disk is almost full, need to free some space, hope to have this working again soon.



                                                        Copyright © 2013 CES Felipe II - Universidad Complutense de Madrid