Tuesday, February 20, 2007

Report from FAST 2007: Data ONTAP GX Paper

The night before the presentation, Peter Corbett, Dan Nydick, and I worked on the slides Peter was to present. Peter then fine-tuned them and arrived exactly on time to present (much to the relief of everyone involved). The wait was worth it, though, as Peter definitely improved the slides. (I later presented the paper to a data storage class at a university in northern California. You can view that version of the slides [sans performance data, for now at least] on my personal web site.)

At the FAST presentation, there were several questions, which I feverishly attempted to paraphrase. Here they are, with the answers given and, in some cases, my color commentary (in italics):


Q: Was a single file system used in the performance charts (given during the presentation)?

A: A single namespace, with at least one volume per D-blade, was used.

Q: Why doesn't it scale beyond 24 nodes? What happens at 25?

A: We stopped at 24 because we achieved our initial one million operations/second goal. We believe it will scale beyond 24.

Q: What can limit scaling?

A: The replicated coherent database can potentially be a limiter.

Also, I think the other potential limiter is the cluster interconnect, but so far switch vendors have been able to build devices more than capable of switching dozens to low hundreds of nodes.

Q: What benchmark is used for CIFS numbers?

A: Currently there is no standard CIFS benchmark, and we didn't prepare CIFS numbers for the presentation.

Also, our CIFS benchmark measures aggregate read and write throughput as the NFS tests do, and the numbers will be similar. Note that SFS 4.0 will provide CIFS performance measurements.

Q: Why is write throughput half the read throughput?

A: READs are faster because the benchmark uses sequential I/O, and READs can benefit from read-ahead.
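
To illustrate the read-ahead effect (this is my own sketch, not the benchmark code): with sequential access, the kernel or server can prefetch data ahead of the request stream, so most READs are satisfied from cache, while WRITEs get no comparable prefetch benefit. The standard POSIX posix_fadvise() hint below shows the idea in miniature; the actual prefetch policy is implementation-defined.

    #define _POSIX_C_SOURCE 200112L
    /* Read a file sequentially, hinting the kernel to read ahead. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        char buf[65536];
        ssize_t n;
        int fd, rc;

        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }
        fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* Hint that access will be sequential so the kernel can
         * prefetch blocks before the application asks for them. */
        rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
        if (rc != 0)
            fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));
        while ((n = read(fd, buf, sizeof buf)) > 0)
            ;   /* sequential reads mostly hit prefetched cache */
        close(fd);
        return 0;
    }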

Q: For the load balancing mirror feature, aren't you worried about writing multiple mirrors?

A: The load balancing mirrors are read-only. Only the master of a mirror family is writeable.

In the presentation slides I've posted, I've attempted to make this clearer.
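
To make the read/write split concrete, here is a minimal sketch of the idea, not ONTAP GX's actual code (the struct and function names are my own invention): reads rotate across the read-only mirrors while every write is routed to the single writable master, so the mirrors never diverge.

    #include <stddef.h>

    struct volume { const char *name; };

    struct mirror_family {
        struct volume *master;      /* the only writable copy */
        struct volume **mirrors;    /* read-only load-balancing mirrors */
        size_t nmirrors;
        size_t next;                /* round-robin cursor */
    };

    /* Writes always go to the master; mirrors are never written. */
    struct volume *route_write(struct mirror_family *f)
    {
        return f->master;
    }

    /* Reads rotate across the read-only mirrors, spreading load,
     * falling back to the master if no mirrors exist. */
    struct volume *route_read(struct mirror_family *f)
    {
        struct volume *v;

        if (f->nmirrors == 0)
            return f->master;
        v = f->mirrors[f->next];
        f->next = (f->next + 1) % f->nmirrors;
        return v;
    }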

You can read the paper at my personal website.

Monday, February 12, 2007

Data ONTAP GX paper at FAST 2007 this week

With Peter Corbett, Mike Kazar, Dan Nydick, and Chris Wagner, I submitted a paper on NetApp's Data ONTAP GX architecture, and it was accepted for this week's FAST conference. Peter is scheduled to present our paper this Thursday at 1:30 pm. (Apparently the venue is the San Jose Marriott.)


FAST '07

I'll follow up with a summary of audience questions and reactions.

Connectathon 2007

I'm trying (with mixed success) to travel less this year, and was going to skip Connectathon. However, I currently own the sessions portion of the NFSv4.1 spec, and several developers had issues and questions, so I showed up for a few days. I didn't catch many presentations, but three are worth a look.

Dave Noveck (one of my fellow NFSv4.1 specification editors), via his proxy Tom Talpey, presented an excellent summary of what's new in NFSv4.1 versus NFSv4.0.

Ben Rockwood (of the cuddletech storage blog) discussed how he and his employer use NFS in what seems to be an OpenSolaris-only shop. Interestingly, Ben seems to be using bleeding-edge OpenSolaris code, which is a sharp contrast from my experience with how customers use Linux.

Finally, Brent Callaghan of Apple discussed the NFS client and server changes in the upcoming Leopard release of Mac OS X. Brent's talk is a good reminder of why a monoculture in the desktop computing space is a bad thing, because Brent and his team produced a lot of interesting ideas and innovations. For example, Leopard adds Kerberized NFS support, joining Solaris, Linux, and AIX among the UNIX-like NFS clients, but rather than stick Kerberos credentials in a ticket file, the tickets are kept in a per-user instance of the gssd daemon. BTW, Leopard will have a rudimentary NFSv4 client.