Module `sktmls.models.contrib.generic_logic_model`

Classes

class GenericLogicModel (model_name: str, model_version: str, features: List[str], model=None, preprocess_logic: Dict[str, List[Any]] = None, postprocess_logic: Dict[str, List[Any]] = None, predict_fn: str = 'predict', data: Dict[str, Any] = {})

A class built on a single model that is registered in the MLS model registry.

It processes requests using preprocessing and postprocessing logic supplied in JSON form.

Args

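model_name: (str) Model name
model_version: (str) Model version
features: (list(str)) List of feature names
model: (optional) Model object trained with an ML library (default: None)
preprocess_logic: (optional) (dict) Preprocessing logic that builds preprocessed_x, the input to the ML model's predict function, from the passed features x (default: see below)
postprocess_logic: (optional) (dict) Postprocessing logic that builds the return body (the items list) from y, the result of the ML model's predict function (default: see below)
predict_fn: (optional) (str) Name of the ML model's inference function (predict|predict_proba|none) (default: predict)
- Passing none disables ML model inference (rule model).
- This parameter is ignored when model_lib is pytorch.
data: (optional) (dict) Additional data that preprocess_logic and postprocess_logic can reference through "var" alongside the features (default: {})
- If a key has the same name as a feature, it overwrites the feature value. Use with care!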
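
The override rule for `data` can be pictured as a dictionary merge. The following is a minimal sketch, not the sktmls implementation: `build_context` is a hypothetical helper, and the merge order is an assumption inferred from the warning above.

```python
from typing import Any, Dict, List


def build_context(features: List[str], x: List[Any], data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: combine feature values and extra data into one "var" lookup table."""
    if len(features) != len(x):
        raise ValueError("x must contain one value per feature name")
    context = dict(zip(features, x))  # feature name -> feature value
    context.update(data)              # assumed order: data keys win on a name clash
    return context


context = build_context(
    features=["age", "five_g_yn"],
    x=[30, "N"],
    data={"threshold": 0.5, "age": 99},  # "age" collides with a feature name
)
print(context)
```

In this sketch the `age` feature is silently replaced by the value from `data`, which is why keys in `data` are best kept distinct from feature names.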

Example

See the link below for detailed examples.

https://github.com/sktaiflow/mls-samples/tree/main/GenericLogicModel


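from sktmls.models.contrib.generic_logic_model import GenericLogicModel

# Feature names
# List of dimension names that exist in the user profile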
features = ["feature1", "feature2", "feature3", "embedding_vector", "context_feature1", "context_feature2"]

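# Preprocessing logic
# Defines how preprocessed_x is built from the passed features.
# Feature values are referenced as {"var": ["feature_name", default_value]}.
# This is an illustrative example and may differ from a real problem. Use it as a reference; do not copy it as-is.
# If not passed, the x list is used as-is without preprocessing.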
preprocess_logic = {
    # float: casts every element of the list to float.
    "float": [
        # merge: merges the elements into a single list and returns it.
        {"merge": [
            {"var": ["feature1", 0]},
            {"if": [
                {"==": [{"var": ["feature2", "N/A"]}, "S"]},
                1,
                {"==": [{"var": ["feature2", "N/A"]}, "V"]},
                2,
                0
            ]},
            {"var": ["feature3", 0]},
            # replace: replaces None in the list with 0.0
            {"replace": [
                None,
                0.0,
                # pick: returns only the given indices of the list
                {"pick": [
                    {"var": ["embedding_vector", [0.0] * 64]},
                    [1, 2]
                ]}
            ]},
            # weekday: returns the day of the week (Monday: 1 to Sunday: 7)
            {"weekday": []},
            # day: today's day of the month
            # ndays: the last day of the current month
            # /: division
            {"/": [{"day": []}, {"ndays": []}]},
            # get: fetches the value at the given index from a list or dictionary
            {"get": [
                {"replace": [
                    "#",
                    "0.0",
                    {
                        # pf: profile lookup
                        # - type: (`user`|`item`)
                        # - profile ID
                        # - user_id or item_id to look up
                        # - key dimensions to look up
                        "pf": [
                            "item",
                            "device",
                            # additional_keys: list of additional keys passed from the Recommendation API (v3 or later)
                            # additional_keys.0: the 0th element
                            {"var": ["additional_keys.0", "ABCD"]},
                            ["sale_cnt"]
                        ]
                    }
                ]},
                0
            ]}
        ]}
    ]
}

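# Postprocessing logic
# Defines how the final return body (the items list) is built from the computed y.
# Feature values are referenced as {"var": "feature_name"}.
# The y value is referenced as {"var": "y"}.
# If y is a list, its elements are referenced as {"var": "y.0"}, {"var": "y.1"}, and so on.
# This is an illustrative example and may differ from a real problem. Use it as a reference; do not copy it as-is.
# If not passed, the default below is used.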
# [{"id": "ID", "name": "NAME", "type": "TYPE", "props": {"score": {"var": "y"}}}]
postprocess_logic = {
    "if": [
        # if
        # 25 <= age < 65 and
        # five_g_yn == "N"
        {
            "and": [
                {">=": [{"var": ["age", 0]}, 25]},
                {"<": [{"var": ["age", 0]}, 65]},
                {"==": [{"var": ["five_g_yn", "N"]}, "N"]},
            ]
        },
        # then
        {
            "list": [
                {
                    "dict": [
                        "id", "PRODUCT001",
                        "name", "Product 001",
                        "type", "Type",
                        "props", {
                            "dict": [
                                "context_id", {
                                    "if": [
                                        # if the value of context_feature1 is "Y"
                                        {"==": [{"var": ["context_feature1", "N"]}, "Y"]},
                                        # use "context_feature1" as the context_id
                                        "context_feature1",
                                        # else if the value of context_feature2 is "Y"
                                        {"==": [{"var": ["context_feature2", "N"]}, "Y"]},
                                        # use "context_feature2" as the context_id
                                        "context_feature2",
                                        # else use "default_context" as the context_id
                                        "default_context"
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        },
        # else if
        # 19 <= age < 25 and
        # five_g_yn == "N"
        {
            "and": [
                {">=": [{"var": ["age", 0]}, 19]},
                {"<": [{"var": ["age", 0]}, 25]},
                {"==": [{"var": ["five_g_yn", "N"]}, "N"]},
            ]
        },
        # then
        {
            "list": [
                {
                    "dict": [
                        "id", "PRODUCT002",
                        "name", "Product 002",
                        "type", "Type",
                        "props", {
                            "dict": [
                                "context_id", {
                                    "if": [
                                        # if the value of context_feature1 is "Y"
                                        {"==": [{"var": ["context_feature1", "N"]}, "Y"]},
                                        # use "context_feature1" as the context_id
                                        "context_feature1",
                                        # else if the value of context_feature2 is "Y"
                                        {"==": [{"var": ["context_feature2", "N"]}, "Y"]},
                                        # use "context_feature2" as the context_id
                                        "context_feature2",
                                        # else use "default_context" as the context_id
                                        "default_context"
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        },
        # else
        # return None
        None
    ]
}

# `model` is assumed to be a model object already trained with an ML library.
my_model_v1 = GenericLogicModel(
    model=model,
    model_name="my_model",
    model_version="v1",
    features=features,
    preprocess_logic=preprocess_logic,
    postprocess_logic=postprocess_logic,
    predict_fn="predict"
)

# Specify the library used, for accurate logging (recommended) (`sklearn`|`lightgbm`|`xgboost`|`catboost`|`pytorch`|`rule`|`etc`)
my_model_v1.model_lib = "lightgbm"
my_model_v1.model_lib_version = "2.3.1"
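
The `var`, comparison, `and`, and `if` operators used in the example above follow JsonLogic-style semantics. To make the evaluation order concrete, here is a minimal sketch of an evaluator for that small subset. It is not the evaluator sktmls uses: `evaluate` and `_lookup` are hypothetical names, `and` is simplified to return a plain boolean, and the sktmls-specific operators (`merge`, `pick`, `replace`, `pf`, `list`, `dict`, and so on) are deliberately left out.

```python
from typing import Any, Dict


def _lookup(path: str, default: Any, context: Dict[str, Any]) -> Any:
    """Resolve a dotted path such as "y.0" against nested dicts and lists."""
    current: Any = context
    for part in path.split("."):
        try:
            if isinstance(current, (list, tuple)):
                current = current[int(part)]
            else:
                current = current[part]
        except (KeyError, IndexError, ValueError, TypeError):
            return default
    return current


def evaluate(logic: Any, context: Dict[str, Any]) -> Any:
    """Evaluate a tiny subset of JsonLogic-style rules: var, ==, <, >=, and, if."""
    if not isinstance(logic, dict) or len(logic) != 1:
        return logic  # literals (numbers, strings, None) evaluate to themselves
    op, args = next(iter(logic.items()))
    if not isinstance(args, list):
        args = [args]  # {"var": "y"} is shorthand for {"var": ["y"]}
    if op == "var":
        default = args[1] if len(args) > 1 else None
        return _lookup(str(args[0]), default, context)
    if op == "if":
        # Arguments come in (condition, value) pairs, with an optional trailing else value.
        for i in range(0, len(args) - 1, 2):
            if evaluate(args[i], context):
                return evaluate(args[i + 1], context)
        return evaluate(args[-1], context) if len(args) % 2 == 1 else None
    if op == "and":
        return all(evaluate(a, context) for a in args)  # simplified: returns a bool
    values = [evaluate(a, context) for a in args]
    if op == "==":
        return values[0] == values[1]
    if op == "<":
        return values[0] < values[1]
    if op == ">=":
        return values[0] >= values[1]
    raise ValueError(f"unsupported operator: {op}")


rule = {
    "if": [
        {"and": [
            {">=": [{"var": ["age", 0]}, 25]},
            {"<": [{"var": ["age", 0]}, 65]},
        ]},
        "PRODUCT001",
        {">=": [{"var": ["age", 0]}, 19]},
        "PRODUCT002",
        None,
    ]
}

adult = evaluate(rule, {"age": 30})
young = evaluate(rule, {"age": 20})
minor = evaluate(rule, {"age": 10})
first_score = evaluate({"var": "y.0"}, {"y": [0.9, 0.1]})
```

The `if` operator checks its conditions in order and returns the value paired with the first one that holds, which is why the branches in `postprocess_logic` read as if / else if / else.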

Ancestors

Methods

def predict(self, x: List[Any], **kwargs) -> Dict[str, Any]
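
Based on the argument descriptions above, a call to `predict` appears to run three stages: preprocessing, model inference (skipped when `predict_fn` is `none`), and postprocessing into an items list. The sketch below only illustrates that flow under stated assumptions. `predict_sketch` and `SumModel` are hypothetical, the single-row batching and the `"items"` key of the returned dictionary are assumptions, and the real method evaluates the JSON logic rather than taking Python callables.

```python
from typing import Any, Callable, Dict, List, Optional


class SumModel:
    """Stand-in for a trained model; any object exposing the chosen inference method would do."""

    def predict(self, rows: List[List[float]]) -> List[float]:
        return [sum(row) for row in rows]


def predict_sketch(
    x: List[Any],
    preprocess: Callable[[List[Any]], List[float]],
    postprocess: Callable[[Any], List[Dict[str, Any]]],
    model: Optional[Any] = None,
    predict_fn: str = "predict",
) -> Dict[str, Any]:
    """Hypothetical outline of the preprocess -> infer -> postprocess flow."""
    preprocessed_x = preprocess(x)
    if predict_fn == "none":
        y = None  # rule model: no ML inference
    else:
        # Assumption: the model receives a one-row batch and y is the first result.
        y = getattr(model, predict_fn)([preprocessed_x])[0]
    return {"items": postprocess(y)}  # assumption: the items list is returned under "items"


def default_items(y: Any) -> List[Dict[str, Any]]:
    # Mirrors the documented default postprocess_logic.
    return [{"id": "ID", "name": "NAME", "type": "TYPE", "props": {"score": y}}]


result = predict_sketch(
    x=[1, "2", 3.5],
    preprocess=lambda values: [float(v) for v in values],
    postprocess=default_items,
    model=SumModel(),
)
rule_result = predict_sketch(
    x=[1, "2", 3.5],
    preprocess=lambda values: [float(v) for v in values],
    postprocess=default_items,
    predict_fn="none",
)
```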

Inherited members

MLSGenericModel:
- save
- set_model_lib
MLSTrainable:
- evaluate
- fit
- get_feature_importance
- get_model_names_persisted
- persist_models
- set_local_path
- set_mms_path
- unpersist_models