Difference between revisions of "Cluster:LowLatency"

From Earlham CS Department
(Notes: add tp_timer slideshow)
 
(9 intermediate revisions by 3 users not shown)
Line 1: Line 1:
==Plan (updated August 28, 2005)==
+
==Internal Links==
* Decide on two benchmarks, one micro (low-level) and one application, probably MD with GROMACS in a form that's latency bound on bazaar & Cairo.  Set these up on bazaar and Cairo and start tests.
+
* [[Cluster:LLK_Poster_Structure|Poster Structure]] (Moved to CVS/LaTeX)
 +
* [[llk-packet-diagram|Packet Diagram]]
 +
* [[Running Timer Code]]
 +
* [[llk QA]]
  
* Formal literature search and preliminary review.
+
==The Plan (updated January 28, 2006)==
 +
* Low-level benchmarks are netpipe (TCP) and pacgen (UDP).  The high-level benchmark is MrBayes.  Define run parameters for each set.  Set up scripts for PBS/Maui to run the defined sets.  Run on cairo only.
 +
 
 +
* Formal literature search and review. Use/add to the keywords in llk/poster/abstract.tex.  Places to look:
 +
** Beowulf archives
 +
** ACM
 +
** IEEE
 +
** Citeseer
 +
** Google Scholar
 +
** tp_timer
 +
** NIST
  
 
* Test tcp_low_latency: does it work?  If so, what exactly does it do?
  
 
* Show that the latency is in the network stack, either by reference or by test (preferably the former, since it will take less time).
==Packet Diagram==
 
 
[[llk-packet-diagram|Packet Diagram]]
 
 
==Running Timer Code==
 
 
[[Running Timer Code]]
 
  
 
==Notes==
 
Line 89: Line 95:
  
 
==Links==
 
 +
*[http://cluster.earlham.edu/bugzilla/buglist.cgi?query_format=specific&order=relevance+desc&bug_status=__open__&product=llk Current LLK bug list]
 +
 
*[http://www.faqs.org/docs/Linux-HOWTO/Kernel-HOWTO.html Linux Kernel Howto]
 
 
*[http://en.tldp.org/LDP/tlk/tlk.html The Linux Kernel Online Book]
 
Line 94: Line 102:
 
*[http://www.linuxhq.com/ The Linux Headquarters]
 
 
*[http://kernelnewbies.org/ Kernel Newbies]
 
 +
*[http://www.ee.unimelb.edu.au/staff/lha/Linux_network_stack_walkthrough.html Walkthrough of the 2.4 network stack]
 +
*[http://www.spine-group.org/papers/stack/stack2.6.html 2.6 stack (Italian)]
 +
*[https://svn.gnumonks.org/trunk/doc/packet-journey-2.6.xml 2.6 packet journey (hard to read)]

Latest revision as of 19:10, 28 January 2006

Internal Links

The Plan (updated January 28, 2006)

  • Low-level benchmarks are netpipe (TCP) and pacgen (UDP). The high-level benchmark is MrBayes. Define run parameters for each set. Set up scripts for PBS/Maui to run the defined sets. Run on cairo only.
  • Formal literature search and review. Use/add to the keywords in llk/poster/abstract.tex. Places to look:
    • Beowulf archives
    • ACM
    • IEEE
    • Citeseer
    • Google Scholar
    • tp_timer
    • NIST
  • Test tcp_low_latency: does it work? If so, what exactly does it do?
  • Show that the latency is in the network stack, either by reference or by test (preferably the former, since it will take less time).
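
If the "by test" route is taken for the last item, one rough starting point is to time a datagram that never leaves the host. The sketch below is only an illustration under stated assumptions: the function name udp_loopback_ns is hypothetical, it is not part of netpipe or pacgen, and it sends a single UDP datagram from a socket to itself over 127.0.0.1, so the measured time covers the send and receive paths of the software stack but no NIC driver or wire.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not from the original notes): send one small UDP
 * datagram from a socket to itself over 127.0.0.1 and return the elapsed
 * send-to-receive time in nanoseconds, or -1 on any error. */
long long udp_loopback_ns(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a free port */
    socklen_t len = sizeof(addr);
    struct timeval tv = {1, 0};             /* 1 s receive timeout, so we never hang */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0 ||  /* learn the chosen port */
        setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) {
        close(fd);
        return -1;
    }

    char out[8] = "llkping";                /* 7 characters plus the terminating NUL */
    char in[8];
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    ssize_t sent = sendto(fd, out, sizeof(out), 0,
                          (struct sockaddr *)&addr, sizeof(addr));
    ssize_t got = -1;
    if (sent == (ssize_t)sizeof(out))
        got = recvfrom(fd, in, sizeof(in), 0, NULL, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    if (got != (ssize_t)sizeof(in) || memcmp(in, out, sizeof(out)) != 0)
        return -1;
    return (t1.tv_sec - t0.tv_sec) * 1000000000LL + (t1.tv_nsec - t0.tv_nsec);
}
```

A single sample is noisy; a real run would presumably repeat the call many times and look at the minimum or median, then compare against the same measurement between two nodes to see how much of the total is spent off the wire.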

Notes

  • BibTeX references are in /cluster/project/llk/llk.bib
  • Comparisons & Instrumentation
  • Low Latency Kernel Option
    • net.ipv4.tcp_low_latency sysctl
      • /proc/sys/net/ipv4/tcp_low_latency
    • From /usr/src/linux/Documentation/networking/ip-sysctl.txt
       tcp_low_latency - BOOLEAN
       If set, the TCP stack makes decisions that prefer lower
       latency as opposed to higher throughput.  By default, this
       option is not set meaning that higher throughput is preferred.
       An example of an application where this default should be
       changed would be a Beowulf compute cluster.
       Default: 0
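
As a starting point for the "does it work?" test, the current setting can be read straight from the /proc file listed above. This is a minimal sketch, not an established tool: read_bool_sysctl and write_bool_sysctl are hypothetical helper names, and they take the path as a parameter so they can be tried on any file before being pointed at /proc/sys/net/ipv4/tcp_low_latency. Writing under /proc/sys normally requires root.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read a boolean sysctl exposed as a file.
 * Returns 0 or 1 on success, -1 if the file cannot be opened or parsed. */
int read_bool_sysctl(const char *path)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;                  /* missing file or no permission */
    int value = 0;
    int parsed = fscanf(fp, "%d", &value);
    fclose(fp);
    if (parsed != 1)
        return -1;                  /* file did not start with an integer */
    return value != 0;
}

/* Hypothetical helper: write 0 or 1 to a boolean sysctl file.
 * Returns 0 on success, -1 on failure. */
int write_bool_sysctl(const char *path, int enabled)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int ok = fprintf(fp, "%d\n", enabled ? 1 : 0) > 0;
    if (fclose(fp) != 0)            /* the write may only fail at flush time */
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling read_bool_sysctl("/proc/sys/net/ipv4/tcp_low_latency") before and after each benchmark run would record which setting the numbers belong to. From a shell, sysctl net.ipv4.tcp_low_latency reads the same value.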

Ideas

  • Reduce error checking to improve performance
  • Reduce number of memory copies, maybe by not being fully TCP/IP-compliant?
  • Look into STP (Scheduled Transfer Protocol)
  • Check out the TCP_NODELAY sockopt:
 /* Disable Nagle's algorithm so that small messages are sent immediately. */
 if(setsockopt(sockfd, proto->p_proto, TCP_NODELAY, &one, sizeof(one)) < 0)
 {
   printf("NetPIPE: setsockopt: TCP_NODELAY failed! errno=%d\n", errno);
   exit(556);
 }

http://www.openldap.org/lists/openldap-devel/199907/msg00079.html
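
To check that TCP_NODELAY actually takes effect on a given kernel, the option can be set and then read back with getsockopt. The sketch below is an assumption-laden illustration rather than NetPIPE code: the helper names are hypothetical, and it passes IPPROTO_TCP directly instead of the proto->p_proto lookup used in the fragment above.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: enable TCP_NODELAY on sockfd and read it back.
 * Returns 1 if the option reads back as set, 0 if not, -1 on error. */
int enable_and_check_nodelay(int sockfd)
{
    int one = 1;
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0)
        return -1;

    int flag = 0;
    socklen_t len = sizeof(flag);
    if (getsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) < 0)
        return -1;
    return flag != 0;
}

/* Hypothetical helper: create an unconnected TCP socket, apply the
 * option, and close the socket again. No network traffic is generated. */
int nodelay_self_check(void)
{
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;
    int result = enable_and_check_nodelay(sockfd);
    close(sockfd);
    return result;
}
```

This only confirms that the kernel accepts and reports the option; whether it lowers latency for small messages still has to be measured with the benchmarks.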

Links