GWI Go Challenge by Anastasios Nikolakopoulos#64
Conversation
|
Title: GWI Favorites Service - Complete Implementation Description:
|
lunemec
left a comment
There was a problem hiding this comment.
Hello Anastasios,
thank you for your submission!
Here are some questions for you. In general, if you can suggest how you'd improve this if it were to be production ready service, what it should have, what its missing, etc.
Thanks!
| COPY go_impl.go . | ||
|
|
||
| # Initialize module and download dependencies | ||
| RUN go mod init github.com/gwi-challenge && \ | ||
| go get github.com/google/uuid && \ | ||
| go get github.com/gorilla/mux && \ | ||
| go get github.com/lib/pq |
There was a problem hiding this comment.
Why did you choose this approach, over having go.mod and go.sum in the repository directly?
There was a problem hiding this comment.
Good question. I actually considered both approaches during development. The reason I went with dynamic initialization in the Dockerfile was simplicity for submission. For the context of this project's needs, having a go.mod file felt like extra overhead. The approach I used leaves Docker handle everything in one place. It downloads the exact versions I tested with, so there's no risk of version drift between local development and the container.
That said, you're right that in a production setup I should definitely commit go.mod and go.sum to the repository. I would have better version control and audit trail, and no re-downloading on every Docker build (among other advantages). For this exercise though, I prioritized getting it running quickly. For real deployment, having those files in git is the better practice.
| @@ -0,0 +1,1326 @@ | |||
| // Package main contains the GWI Favorites Service implementation in Go | |||
There was a problem hiding this comment.
Interesting choice to have the entire implementation in 1 file. Can you describe how you'd separate the file into different packages (no need to rework it now).
There was a problem hiding this comment.
Yeah, I get that question. The single file works fine for a challenge submission, but yeah, in production I would split it.
Here's how I'd organize it:
models/ - Data structures. User, Asset, Favorite, pagination types. Nothing but structs and constants.
storage/ - Database layer. All queries, connection logic, the Storage interface. This stays independent of business logic.
service/ - Business rules. The Service struct with CreateUser, AddFavorite, that kind of thing. It calls storage but doesn't know about HTTP.
handlers/ - HTTP endpoints only. RequestHandler with CreateUser, ListUsers, etc. Takes HTTP requests, calls the service, returns JSON.
main.go - Just wires everything together. Creates the database connection, instantiates the service and handlers, starts the server.
The key would be for each package to have one job. Handlers would not touch the database. Storage would not know about HTTP. Service would not care how data gets in or out. This would also make testing easier, too.
| // CreateAsset creates a new asset and returns its ID. | ||
| // Data is stored as JSONB for flexibility and queryability. | ||
| func (s *Storage) CreateAsset(assetType string, data json.RawMessage) (string, error) { | ||
| assetID := uuid.New().String() |
There was a problem hiding this comment.
Can you think of any issues with this?
There was a problem hiding this comment.
A potential issue that I am know thinking of is that, if the UUID is generated but the database insert fails, you've created a UUID that doesn't exist in the system. It's not a huge problem in this case since nothing references it yet, but it's not clean.
The better approach maybe would be to let the database generate it. PostgreSQL can use gen_random_uuid() as a default, so we could just insert NULL for the ID and leave the database to handle. Then we could query back the generated ID.
Another approach would be to generate the UUID here but only in a transaction. So, if the insert fails, the whole thing rolls back. That way the UUID and the row stay in sync.
I went with client-side generation for simplicity, but a server-side (database-generated) would be more robust for product-grade environments.
Shall you have other issues in mind, feel free to let me know :)
| -- Assets table (all types: chart, insight, audience) | ||
| -- Using JSONB for type-specific data keeps schema simple while remaining queryable | ||
| CREATE TABLE IF NOT EXISTS assets ( | ||
| id UUID PRIMARY KEY, |
There was a problem hiding this comment.
Are you aware of any potential downsides to using uuid?
There was a problem hiding this comment.
UUIDs are bigger than integers. A UUID takes 16 bytes, an auto-increment integer takes 4 bytes. Over millions of rows, that adds up in index size and memory usage.
They're also not sequential. Integer keys cluster naturally - rows created around the same time sit near each other on disk. UUIDs are random, so inserts jump all over the place. That can slow down index performance a bit.
The reason I went with UUIDs is because of their advantages. We can generate them client-side without hitting the database. We don't have to worry about ID collisions across services. And they tend to be better for distributed systems.
For this project, I went with the selection of UUIDs because the data volume is small, and the simplicity of not coordinating IDs across services is worth the tradeoff. In a production environment, if we were dealing with billions of rows and needed every ounce of performance, I'd reconsider.
| type Asset struct { | ||
| ID string `json:"id"` | ||
| Type string `json:"type"` // "chart", "insight", "audience" | ||
| Data json.RawMessage `json:"data"` // Type-specific data as JSON |
There was a problem hiding this comment.
How would you approach this if we wanted to validate each asset type's json structure ?
There was a problem hiding this comment.
Good question. Right now I'm just storing raw JSON, which is flexible but yeah, no validation.
I see two options. The first option would be for me to go with JSON schema. I would define a schema for each asset type (what fields are required, what types they should be). Then when an asset comes in, validate it against the schema before storing. Based on research I did now, there are Go libraries in github that do this. I would define schemas like:
chart schema: requires title, x_axis, y_axis, data array
insight schema: requires title, summary, tags
audience schema: requires name, size, demographics
Then in CreateAsset, before I insert, I would check the JSON against the right schema. If it doesn't match, then return a validation error.
The other approach, the second option, would be custom validation functions. A ValidateChart(), ValidateInsight(), etc. More code, but I would have full control over the logic.
For this project, I kept it loose on purpose. But in production, schema validation is essential. We wouldn't want garbage data getting stored just because someone sent malformed JSON.
| Description string `json:"description"` | ||
| } | ||
|
|
||
| if err := json.NewDecoder(r.Body).Decode(&req); err != nil { |
There was a problem hiding this comment.
What other ways to do this are there? When would you use which?
There was a problem hiding this comment.
There are a few ways to read JSON from a request body.
-
json.NewDecoder(r.Body).Decode()- This is what I used. It reads directly from the stream without loading everything into memory first. It is good for large payloads or streaming data. -
ioutil.ReadAll() then json.Unmarshal()- With this approach, I can read the entire body into a byte slice, then unmarshal it. It is simple and straightforward, and works fine for small requests. -
There are third-party libraries - Frameworks like Echo or Gin have helper methods that do similar things but with more sugar.
I picked the Decoder approach because it's efficient. I don't need to load the whole body into memory. In this project the payloads are tiny, so it doesn't matter much. But it's good practice for anything that might handle larger data.
If I am debugging and need to inspect the raw JSON, ReadAll() would be the better choice, because I can print it. If performance matters at all, I would say that Decoder is solid, because it's idiomatic Go and scales better if the payloads get bigger later.
|
|
||
| // ListAssets fetches all assets with pagination. | ||
| // Returns (assets, totalCount, error) | ||
| func (s *Storage) ListAssets(limit int, offset int, assetType *string) ([]*Asset, int, error) { |
There was a problem hiding this comment.
Why are there users, when everyone can read everyone's data? The asset table does not even save user_id.
There was a problem hiding this comment.
Fair point. Yeah, the way I built it, assets are global. Any user can see all assets. The users table just tracks who exists in the system.
The idea was to keep it simple for the challenge. In a real product, we would probably want asset ownership. Each asset would belong to a user, only they can delete it, maybe they would also control who sees it. That's when user_id in the assets table makes sense.
But the brief said users can favorite assets, not create them. It did not specify about ownership. So I treated assets as a shared library that anyone can browse and favorite. However, you could argue the users table is unnecessary then. Fair. The favorites table is what links users to assets. You could technically not have a users table and just reference user IDs in favorites.
I kept it for clarity though. It represents the concept "this user exists in the system", even if they haven't favorited anything yet. Cleaner than orphaned user IDs floating in the favorites table.
|
|
||
| // CreateUser creates a new user and returns the created user object. | ||
| func (s *Service) CreateUser() (map[string]interface{}, error) { | ||
| userID := uuid.New().String() |
There was a problem hiding this comment.
This is inconsistent. Once time uuid is created at the storage layer, but here it is on the service layer. What is the reason behind it?
There was a problem hiding this comment.
You caught an inconsistency. Yeah, that's a mistake on my part.
In CreateAsset I generate the UUID at the storage layer. In CreateUser I'm doing it at the service layer. I should've picked one and stuck with it. The right approach is probably storage layer. The storage layer owns the data, so it should own generating IDs. That keeps the service layer focused on business logic, not ID generation. Like a cleaner separation of "concerns".
In CreateUser, I should call s.storage.CreateUser() and let the storage handle the UUID. Follow the same pattern as assets. I think I did it inconsistently because I was refactoring and didn't catch it. Good catch, my bad!
| totalPages = 1 | ||
| } | ||
|
|
||
| return map[string]interface{}{ |
There was a problem hiding this comment.
Can you suggest improvement? Why is it like this?
There was a problem hiding this comment.
Right now I'm returning a generic map[string]interface{} with the pagination info nested inside. It works, but the issue is type safety. Go can't check at compile time if the keys exist, or if the values are the right type. When someone uses this response, they have to do type assertions.
I should define a struct for the response instead. Something like:
type ListAssetsResponse struct {
Assets []Asset `json:"assets"`
Pagination PaginationInfo `json:"pagination"`
}
Then return that struct. The JSON encoder handles it the same way, but now I get type safety. If someone tries to access a field that doesn't exist, the compiler catches it. I actually did this for PaginatedResponse in the favorites endpoints. I could have done the same here.
|
|
||
| ## Tech Stack | ||
|
|
||
| - **Language**: Go 1.21 |
There was a problem hiding this comment.
Any reason for 1.21 specifically? Current is 1.25.
There was a problem hiding this comment.
Honestly, 1.21 was just what I had installed locally when I started. There's no technical reason not to use 1.25. Since 1.21 is stable and battle-tested, I stuck with that. 1.25 has newer features and performance improvements. The code would work fine on either. If you want me to bump it to 1.25, I will change the Dockerfile and go.mod, and test the deployment :)
No description provided.