the old way
-----------
chained queries
tightly coupled code -- db abstraction layer is great, but avoid mysql, too
no abstract scaling
the new way
-----------
data access separated from your data storage
services oriented architecture
data is requested from a service
data requests are run in parallel
data requests are asynchronous
data layer is loosely coupled
scalability is abstracted
options
-------
requests over HTTP
NY Times DBSlayer
Danga's Gearman, a queue or weird version of map reduce. difficult to explain, kinda like explaining memcache in 2002.
DIY
HTTP w/PHP
----------
1. Group requests for data at the top
2. Open a socket foreach request
a. sockets must be non-blocking
b. make sure to TCP_NODELAY
3. Use __get() to block for results
4. see services digg request
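The steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than PHP, not the code from the talk: `ParallelRequests`, its argument layout, and the plain HTTP/1.0 request are all assumptions. `__getattr__` stands in for PHP's `__get()`, and the hostname lookup inside `connect_ex` is still blocking.

```python
import selectors
import socket


class ParallelRequests:
    """Start every request up front; block only when a result is read."""

    def __init__(self, requests):
        # requests maps a name to (host, port, path); the layout is hypothetical.
        self._sel = selectors.DefaultSelector()
        self._pending = set()
        self._results = {}
        for name, (host, port, path) in requests.items():
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # step 2b
            sock.setblocking(False)                                     # step 2a
            sock.connect_ex((host, port))  # returns at once; connect finishes later
            request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode()
            state = {"name": name, "out": request, "in": b""}
            self._sel.register(sock, selectors.EVENT_WRITE, state)
            self._pending.add(name)

    def _finish(self, sock, state, result):
        self._sel.unregister(sock)
        sock.close()
        self._pending.discard(state["name"])
        self._results[state["name"]] = result

    def _pump(self):
        # Advance every socket that is ready, not only the one being waited on.
        for key, _ in self._sel.select():
            sock, state = key.fileobj, key.data
            try:
                if state["out"]:
                    sent = sock.send(state["out"])
                    state["out"] = state["out"][sent:]
                    if not state["out"]:
                        self._sel.modify(sock, selectors.EVENT_READ, state)
                else:
                    chunk = sock.recv(65536)
                    if chunk:
                        state["in"] += chunk
                    else:  # server closed the connection: response is complete
                        body = state["in"].partition(b"\r\n\r\n")[2]
                        self._finish(sock, state, body)
            except BlockingIOError:
                continue  # spurious wakeup; retry on the next select()
            except OSError as exc:
                self._finish(sock, state, exc)  # surfaced when the result is read

    def __getattr__(self, name):
        # Step 3: only called for unknown attributes, like PHP's __get().
        if name.startswith("_"):
            raise AttributeError(name)
        while name in self._pending:
            self._pump()
        if name not in self._results:
            raise AttributeError(name)
        result = self._results[name]
        if isinstance(result, Exception):
            raise result
        return result
```

Typical use would be to build one `ParallelRequests` at the top of the page with every request it needs, then read `page.user` or `page.diggs` wherever the data is used. The first read blocks, but it also moves all the other requests forward while it waits.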
DBSlayer
--------
Small HTTP daemon written in C
Uses JSON for communications
Connection pooling
Load balancing and failover
Tightly coupled to MySQL
Tightly coupled to SQL
no intelligence
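To make "JSON for communications" concrete, here is a hedged sketch of talking to DBSlayer from a client. The `/db` path, the `SQL` request key, and the `RESULT`/`HEADER`/`ROWS` response keys are written from memory of the DBSlayer documentation and should be treated as assumptions to check against the real daemon. The helper names are made up for illustration.

```python
import json
from urllib.parse import quote


def dbslayer_url(host, port, sql):
    """Build a query URL; the JSON request travels as the URL query string."""
    request = json.dumps({"SQL": sql})  # assumed request shape
    return f"http://{host}:{port}/db?{quote(request)}"


def rows_as_dicts(response_text):
    """Map the assumed {"RESULT": {"HEADER": [...], "ROWS": [[...]]}} shape to dicts."""
    result = json.loads(response_text)["RESULT"]
    return [dict(zip(result["HEADER"], row)) for row in result["ROWS"]]
```

Note that raw SQL still crosses the wire, which is the "tightly coupled to SQL" and "no intelligence" point: the daemon pools and balances connections but knows nothing about your data.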
Gearman
-------
Highly scalable queuing system
Simple/Efficient binary protocol
Jobs can return results (e.g. data)
Sets of jobs are run in parallel
Queue can scale linearly
PHP, Perl, Python, Ruby, C clients
Poorly documented
Not very "robust" -- opportunity for coding
Great for logging and crawling
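Since Gearman is hard to explain, a toy model may help. The sketch below is an in-process stand-in built on Python threads. It is not the Gearman protocol or any Gearman client API, and `ToyJobServer` and `run_set` are invented names. It only illustrates the idea that a set of named jobs is queued, run in parallel by workers, and the results are handed back.

```python
import queue
import threading


class ToyJobServer:
    """In-process stand-in for a job queue; NOT the Gearman protocol."""

    def __init__(self, functions, workers=4):
        self._functions = functions  # job name -> callable, registered by "workers"
        self._jobs = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def _work(self):
        while True:
            name, arg, results, key = self._jobs.get()
            try:
                results[key] = self._functions[name](arg)
            except Exception as exc:  # hand failures back to the caller
                results[key] = exc
            finally:
                self._jobs.task_done()

    def run_set(self, jobs):
        """Submit {key: (job name, argument)} and wait for every result."""
        results = {}
        for key, (name, arg) in jobs.items():
            self._jobs.put((name, arg, results, key))
        self._jobs.join()  # simplification: waits for everything in the queue
        return results
```

In real Gearman the workers are separate processes, often on other machines, and the job server sits between them and the clients. Adding workers is what lets the queue scale.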
Do It Yourself
--------------
Highly customized solutions (Flickr)
Extremely efficient for custom cases
Customize your protocols
Requires more resources
What goes in the services layer?
--------------------------------
smart caching
data mapping and distribution -- 1) db hash file, 2) mysql, 3) finances in Oracle
intelligent grouping of data results
partitioning logic
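A small sketch of what data mapping and partitioning logic inside the services layer could look like. The store names, the shard names, and the CRC32-modulo scheme are all assumptions chosen for illustration. The notes do not say how the mapping was actually done.

```python
import zlib

# Hypothetical mapping of data kinds to backing stores.
STORES = {"sessions": "hashfile", "users": "mysql", "finances": "oracle"}

# Hypothetical MySQL partitions for user data.
USER_SHARDS = ["users-db-0", "users-db-1", "users-db-2", "users-db-3"]


def locate(kind, key):
    """Return (store, shard) for a piece of data; shard is None if unpartitioned."""
    store = STORES[kind]
    if kind == "users":
        # A stable hash keeps the same user on the same shard across requests.
        index = zlib.crc32(str(key).encode()) % len(USER_SHARDS)
        return store, USER_SHARDS[index]
    return store, None
```

Because callers only ask the service for data, this mapping can change without touching application code. Plain modulo hashing does move most keys when the shard count changes, so a lookup table or consistent hashing may suit a growing system better.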
DO WANT!
--------
Intelligently group data into endpoints
User End Point
user settings
user profile data
10 most recent friends
10 most recent diggs
Version your endpoints
Bundle and group requests
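As a rough illustration of a grouped, versioned endpoint, the sketch below bundles the four pieces of user data into one response and routes by version. The `/v1/user/<id>` path layout, the `fetch` callable, and the field names are hypothetical. `fetch` is assumed to return friends and diggs newest first.

```python
def user_endpoint_v1(user_id, fetch):
    """Bundle everything a user page needs into one response."""
    return {
        "version": 1,
        "user_id": user_id,
        "settings": fetch("settings", user_id),
        "profile": fetch("profile", user_id),
        "recent_friends": fetch("friends", user_id)[:10],
        "recent_diggs": fetch("diggs", user_id)[:10],
    }


# New versions are added beside old ones so existing callers keep working.
ENDPOINTS = {("user", 1): user_endpoint_v1}


def dispatch(path, fetch):
    """Route a path such as "/v1/user/42" to the matching endpoint version."""
    _, version, name, user_id = path.split("/")
    handler = ENDPOINTS[(name, int(version.lstrip("v")))]
    return handler(int(user_id), fetch)
```

One call returns the whole bundle, so the caller carves out what it needs instead of making four small requests.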
EPIC FAIL!
----------
no teeny endpoints -- send lots o'data and carve it
Not running SOA requests in parallel
Net_Gearman
How do you transition over?
---------------------------
one framework
data access layer
abstracted query
migrate data by making a user's data read-only
chain of responsibility pattern -- apc, memcache, http, mysql
dependency injection
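The chain of responsibility and dependency injection items can be sketched together. In this minimal, assumed version each layer is a plain dict standing in for APC, memcache, the HTTP service, or MySQL. The `Layer` class is invented for illustration. Each layer is handed its successor rather than creating it, which is the injection part.

```python
class Layer:
    """One link in the chain: answer the request or pass it to the next link."""

    def __init__(self, name, next_layer=None):
        self.name = name
        self.data = {}                # stand-in for the real store
        self.next_layer = next_layer  # injected, so links can be swapped or removed

    def get(self, key):
        """Return (value, name of the layer that answered)."""
        if key in self.data:
            return self.data[key], self.name
        if self.next_layer is None:
            raise KeyError(key)
        value, source = self.next_layer.get(key)
        self.data[key] = value        # fill this layer on the way back up
        return value, source


# Wire the chain from the slowest store outward: apc -> memcache -> http -> mysql.
mysql = Layer("mysql")
http = Layer("http", mysql)
memcache = Layer("memcache", http)
apc = Layer("apc", memcache)
```

During a transition the calling code only ever talks to the head of the chain, so the MySQL link can later be dropped or replaced by a service call without changing callers.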
What caused digg to go for SOA? Write saturation on the master. Sysadmins were not happy with site performance.
10,000 requests per second
phpcs (PHP_CodeSniffer) for checking that php code is properly documented